ABSTRACT

We fit 54,296 sparsely-sampled asteroid lightcurves in the Palomar Transient Factory (PTF) survey to a
combined rotation plus phase-function model. Each lightcurve consists of 20 or more observations acquired 
in a single opposition. Using 805 asteroids in our sample that have reference periods in the literature, we find 
the reliability of our fitted periods is a complicated function of the period, amplitude, apparent magnitude and 
other lightcurve attributes. Using the 805-asteroid ground-truth sample, we train an automated classifier to 
estimate (along with manual inspection) the validity of the remaining ~53,000 fitted periods. By this method
we find 9,033 of our lightcurves (of ~8,300 unique asteroids) have ‘reliable’ periods. Subsequent consideration
of asteroids with multiple lightcurve fits indicates a 4% contamination in these ‘reliable’ periods. For 3,902
lightcurves with sufficient phase-angle coverage and either a reliably-fit period or low amplitude, we examine 
the distribution of several phase-function parameters, none of which are bimodal though all correlate with the 
bond albedo and with visible-band colors. Comparing the theoretical maximal spin rate of a fluid body with 
our amplitude versus spin-rate distribution suggests that, if held together only by self-gravity, most asteroids 
are in general less dense than ~2 g/cm³, while C types have a lower limit of between 1 and 2 g/cm³. These
results are in agreement with previous density estimates. For 5-20 km diameters, S types rotate faster and have 
lower amplitudes than C types. If both populations share the same angular momentum, this may indicate the 
two types’ differing ability to deform under rotational stress. Lastly, we compare our absolute magnitudes (and 
apparent-magnitude residuals) to those of the Minor Planet Center’s nominal (G = 0.15, rotation-neglecting)
model; our phase-function plus Fourier-series fitting reduces asteroid photometric RMS scatter by a factor of ~3.
Subject headings: surveys — minor planets, asteroids: general — solar system: general 


1. INTRODUCTION 

In this work we model an asteroid’s apparent visual magnitude
V (log flux) as

$$V = H + \delta + 5\log_{10}(r\Delta) - 2.5\log_{10}[\phi(\alpha)], \qquad (1)$$

where H is the absolute magnitude (a constant), δ is a periodic
variability term due to rotation (e.g., if the object is spinning
and has some asymmetry in shape or albedo), r and Δ are
the heliocentric and geocentric distances (in AU), and φ =
φ(α) is the phase function, which varies with the solar phase
angle α (the Sun-asteroid-Earth angle). When α = 0 (i.e., at
opposition), φ = 1 by definition, while in general 0 < φ < 1
for α > 0 (with φ decreasing as α increases).
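
As a minimal numerical sketch of Equation (1), the function below evaluates V from its four ingredients. The function name and the example values (H = 15, r = 2.5 AU, Δ = 1.5 AU) are illustrative assumptions, not quantities taken from this work.

```python
import math


def apparent_magnitude(H, delta, r_au, Delta_au, phi):
    """Equation (1): V = H + delta + 5 log10(r * Delta) - 2.5 log10(phi).

    H        -- absolute magnitude
    delta    -- rotational offset in magnitudes at the observation epoch
    r_au     -- heliocentric distance in AU
    Delta_au -- geocentric distance in AU
    phi      -- phase-function value, 0 < phi <= 1 (phi = 1 at opposition)
    """
    if not 0.0 < phi <= 1.0:
        raise ValueError("phi must lie in (0, 1]")
    return H + delta + 5.0 * math.log10(r_au * Delta_au) - 2.5 * math.log10(phi)


# Hypothetical object observed exactly at opposition with no rotational offset.
v_opposition = apparent_magnitude(15.0, 0.0, 2.5, 1.5, 1.0)
```

Because φ < 1 away from opposition, the last term always adds a positive number of magnitudes, i.e., the object appears fainter.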

A key feature of our approach is the simultaneous fitting
of both the phase function φ and the rotation term δ. The
detailed forms of φ and δ, as well as the algorithm underlying
our fitting procedure, are motivated by a variety of prior work
in this area, as described in the following sections.

1.1. Asteroid rotation 

Building upon the work of Kaasalainen et al. (2001), Hanus
& Durech (2012) discuss the inversion of asteroid lightcurve
data taken over several oppositions to obtain a 3D shape solution.
The form of δ (cf. Equation [1]) in this case consists of
a large number of free parameters (several tens to hundreds).
Results from inversion agree well with those from stellar
occultations, adaptive optics imaging, and in-situ spacecraft
imagery (Hanus et al. 2013). Knowledge of the detailed irregular
shapes of asteroids improves our ability to constrain models
of their internal structure, as well as the magnitude and timescale
of spin and orbital evolution due to solar-radiation and thermal
emission, including the Yarkovsky and YORP effects (see
Bottke et al. 2006 and references therein).

A simpler model for δ, suitable for fitting to data sparser
than that required for most inversion methods, is a Jacobi
ellipsoid (Chandrasekhar 1969) in its principal-axis spin state.
The lightcurve of such an ellipsoid is a double-peaked sinusoid,
given by a simple expression depending solely (assuming
constant surface albedo) on the axes ratio and the angle
between the line of sight and spin axis. The fitted amplitude thus
yields a lower-bound elongation estimate for the asteroid.

The predicted distribution of the rotation frequencies of a
collisionally-equilibrated system of particles has long been
claimed to be a Maxwellian function (Salo 1987), which,
as reviewed by Pravec et al. (2002), very well approximates
the observed distribution of several hundred of the brightest
(~40-km or larger) asteroids, but breaks down for smaller
objects, among which an excess of slow and fast rotators
appears to exist. Steinberg & Sari (2015) more recently argue
that collisions instead lead to a Lévy distribution, and
that a significant primordial spin component remains in the
present observed population. Some studies that have examined
the spin distribution of small objects are Pravec et al.
(2008), Polishook & Brosch (2009), the Thousand Asteroid
Lightcurve Survey (Masiero et al. 2009), and two brief
observing runs conducted within the PTF survey (Polishook et
al. 2012; Chang et al. 2014a).

Waszczak et al.

Warner et al. (2009) describe the Lightcurve Database 
(LCDB), which compiles several thousand densely-sampled 
lightcurves of asteroids targeted by dedicated observing 
teams. Lightcurves in the LCDB have the following features:

1. LCDB lightcurves’ dense sampling generally permits 
fitting of Fourier series with many harmonic terms, 

2. LCDB lightcurves are often sampled over the shortest
time window necessary to measure the period, and
therefore generally do not require large or uncertain 
corrections due to phase angle effects, 

3. LCDB lightcurves’ fitted periods are assigned integer 
quality codes by a human reviewer (from 1 = poor to 3 
= confident). 

All three of the above features are either impractical or
infeasible when the set of lightcurves is very large and the data
sparsely sampled, as is the case for PTF. In this work we adopt
the following modified approaches when fitting lightcurves:

1. We truncate the rotation curve’s Fourier-series fit after 
the 2nd harmonic, a simplification broadly justified by 
Harris et al. (2014) and the assumption of an ellipsoidal 
shape (cf. Section 3.1.2), 

2. We simultaneously fit a phase-function model with the 
rotational part, 

3. We use a machine-learned classifier to objectively aid 
in estimating the validity of each fitted period. The 
classifier is trained using all fitted lightcurves that have 
previously (and confidently) measured LCDB periods 
and takes into account the accuracy with which the true 
period was retrieved along with 20 lightcurve metrics
(fitted period, amplitude, ratio of peaks, χ² per degree
of freedom of fit, number of data points, and more).

Use of a machine classifier in asteroid lightcurve period
quality assessment is entirely novel and inspired in part by
work done by PTF collaborators in extragalactic transient
science (Bloom et al. 2012) and variable star science (Masci et al.
2014; Miller et al. 2014), as well as Waszczak et al. (in prep)’s
work on detection techniques for streaking NEOs. Among
the advantages of using a machine-classified quality score is
that, via cross-validation with the known-period sample, one
estimates the completeness and contamination, i.e., the
true-positive and false-positive rates with respect to identifying an
accurately-fit period, as a function of, e.g., the period,
amplitude, etc. The resulting true- and false-positive rates may then
be used to de-bias the classifier-filtered period distribution.
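
The true- and false-positive rates mentioned above can be sketched as follows. This is a generic illustration under our own assumptions (boolean labels, hypothetical function and variable names, toy data); it is not the classifier or the cross-validation scheme used in this work.

```python
def true_and_false_positive_rates(is_accurate, is_flagged_reliable):
    """Return (TPR, FPR) from two equal-length boolean sequences.

    is_accurate         -- True where the fitted period matches the reference period
    is_flagged_reliable -- True where the classifier labels the fit as reliable
    """
    tp = fp = fn = tn = 0
    for truth, flagged in zip(is_accurate, is_flagged_reliable):
        if truth and flagged:
            tp += 1
        elif truth:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Toy known-period sample (made-up labels, for illustration only).
truth = [True, True, True, False, False, False, False, True]
flags = [True, True, False, False, True, False, False, True]
tpr, fpr = true_and_false_positive_rates(truth, flags)
```

Evaluating the same two rates in bins of period or amplitude gives the bin-by-bin correction factors that a de-biasing step would need.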

1.2. Asteroid phase functions 

The analytic phase function of an ideal Lambertian-scattering
sphere fits well to featureless, atmospheric planets
like Venus, but quite poorly to airless bodies (see Figure 3.9 of
Seager 2010 for a comparison). In later sections we describe
several φ models that have been derived for (or empirically fit
to) asteroids. Qualitatively, asteroids show an approximately
linearly decreasing φ out to α ≈ 100 deg, modified by a surge
(increase in slope) at low phase angles (α < 5 deg), known as
the opposition effect (see Figure 1).

Figure 1. Phase curves (from the literature) containing densely-sampled,
rotation-corrected photometry of asteroids in four taxonomic classes.
Colored lines are our original fits to the data using various single-parameter φ
models (cf. Section 3.2). Plotted asteroids and data sources: 588 Achilles,
884 Priamus and 1143 Odysseus (Shevchenko et al. 2012); 24 Themis (Harris
et al. 1989a); 165 Loreley (Harris et al. 1992); 211 Isolde (Harris & Young
1989); 20 Massalia (Gehrels 1956); 29 Amphitrite (Lupishko et al. 1981);
44 Nysa and 64 Angelina (Harris et al. 1989b).

Early work (e.g., Bowell et al. 1989 and refs. therein) on a
small sample of well-observed asteroids suggested that different
asteroid spectral types display distinct behavior in φ.
Figure 1 compares example phase curve data for D, C, S and
E types¹, incorporating photometry from various sources. We
emphasize the fact that all of the data points in Figure 1 have
been corrected for rotational modulation (the δ in Equation
[1]) through dense sampling of each asteroid’s lightcurve at
each phase angle (equivalently, each epoch).

Using a large corpus of low-precision photometry from the
MPC², Oszkiewicz et al. (2011, 2012) showed that a fitted
parameter of one particular φ model correlates well with an
asteroid’s SDSS visible color. While they were unable to correct
for rotational variation (δ-term in Equation [1]), the
Oszkiewicz et al. work nevertheless demonstrates a solid trend
between φ and a compositional attribute (color).

¹ Bus et al. (2002) review these and other asteroid taxonomic classes,
which are defined on the basis of low-resolution (R ~ 100) visible
reflectance spectra.

These prior works motivate several defining aspects of this 
work’s phase-function analysis: 

1. We fit multiple phase function models to each
lightcurve, both for compatibility with the literature and 
to explore how the fitted parameters are related, 

2. We simultaneously fit the rotational component with 
the phase-function part, 

3. We introduce a single colorimetric index for quantifying
C-type vs. S-type taxonomic classification, based
on the compilation of several visible-band-color asteroid
datasets (see Appendix), and examine the variation
in phase-function parameters as a function of this color
index.

2. OBSERVATIONS 
2.1. Overview of the PTF survey 

The Palomar Transient Factory³ (PTF) is a synoptic survey
designed primarily to discover extragalactic transients (Law et
al. 2009; Rau et al. 2009). The PTF camera, mounted on Palomar
Observatory’s 1.2-m Oschin Schmidt Telescope, uses 11
CCDs (each 2K × 4K) to image 7.3 deg² of sky at a time at
1.0"/pixel resolution. Most exposures (~85%) use a Mould-R
filter⁴ (hereafter “R”). The remaining broadband images
acquired use a Gunn g-band filter. Nearly all broadband PTF
images are 60-second integrations, regardless of filter. About
15% of nights (near full moon) are devoted to a narrowband
(Hα) imaging survey of the full Northern Sky.

Science operations began in March 2009, with a nominal
one- to five-day cadence for supernova discovery and typical
twice-per-night imaging of fields. Median seeing is 2" with
a limiting magnitude R ≈ 20.5 (for 5σ point-source detections),
while dark conditions routinely yield R ≈ 21.0 (Law
et al. 2010).

The PTF survey is ongoing and expected to continue 
through mid-2016. In January 2013 the PTF project formally 
entered a second phase called the intermediate PTF (‘iPTF’; 
Kulkarni 2013). In this paper we simply use ‘PTF’ to mean 
the entire survey, from 2009 through the present (2015). The 
iPTF program accommodates more varied ‘sub-surveys’ as 
opposed to a predominantly extragalactic program, including 
variable star and solar system science. Images are still acquired
with the same telescope/camera/filters with 60s exposures,
and are processed by the same reduction pipeline.

Laher et al. (2014) describe the PTF data reduction and
archiving pipelines, hosted at the Infrared Processing and
Analysis Center (IPAC) at Caltech. Processing at IPAC includes
bias and flat-field corrections, astrometric calibration
against UCAC3 (Zacharias 2010), astrometric verification
against 2MASS (Skrutskie et al. 2006), creation of source
catalogs with Source Extractor (Bertin and Arnouts 1996), and
production of reference images (stacks of ~20-30 PTF images
that reach V ≈ 22).

² IAU Minor Planet Center, http://minorplanetcenter.net

³ http://ptf.caltech.edu

⁴ The Mould-R filter is very similar to the SDSS-r filter; see Ofek et al.
(2012a) for its transmission curve.

Ofek et al. (2012a, 2012b) describe the PTF survey’s absolute
photometric calibration method, which relies on source
matching with SDSS DR7 (Abazajian et al. 2009), and thus
requires at least partial overlap of PTF with SDSS each
night. A separate, relative photometric calibration (based on
lightcurves of non-variable field stars) also exists for PTF data
and is described by Levitan et al. (2011) and in the appendix
of Ofek et al. (2011). In this work we utilize all R-band and
g-band PTF data accumulated from the survey’s start (March
2009) through July 2014. The asteroid magnitudes reported in
this work use relative photometric zeropoints when available
(which as of this writing applies to ~85% of PTF images) and
absolute photometric zeropoints otherwise.

The PTF’s robotic survey program and processing pipeline, 
as well as our data aggregation and analysis in this work, 
make use of many functions from the MATLAB package for 
astronomy and astrophysics (Ofek et al. 2014). 

2.2. This work’s data set 

Waszczak et al. (2013) used a custom spatial indexing algorithm
to search the set of all PTF single-epoch transient
detections (through July 2012) for detections of all asteroids
with orbits known as of August 2012. That search procedure
first generated uniformly-spaced ephemerides for each asteroid
using JPL’s online service (HORIZONS; Giorgini et al.
1996). Each asteroid’s ephemeris defines a 3D curve (two
sky coordinates plus one time); the intersection of each curve
with the 3D kd-tree of transient detections was then computed
and positive detections within a 4" matching radius saved.

In this work we use a modified version of the Waszczak et 
al. (2013) algorithm. The updates/changes are as follows. 

Firstly, in terms of content, we now search all PTF (R- and
g-band) data from 01-March-2009 through 18-July-2014 for
all numbered asteroids as of 12-July-2014 (401,810 objects).
We now exclude unnumbered objects, as the positional uncertainty
of these objects can be very large and, as they tend to
be very faint, their lightcurves will not in general be of high
quality.

Secondly, in place of a single-step matching of a 3D
transient-detection kd-tree against 3D ephemeris curves, we
now divide the search into two main steps. We first perform a
2D spatial matching that exploits the natural indexing of PTF
exposures into tiles (i.e., the grid of evenly spaced boresights
or ‘fields’ on the sky). Each 2D ephemeris curve’s intersection
with the 2D PTF survey footprint is computed, the object’s
position is cubically interpolated to all epochs of exposures
possibly containing the object, and the object’s precisely-computed
position is then compared to the precise image boundaries of
candidate exposures. Matching of predicted positions against
actual detections takes place subsequently, as source catalogs
are loaded into memory (as needed and in parallel).
This method is faster than the original Waszczak et al. (2013)
method and enables separate logging of predicted and positive
detections.
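
The final matching step, comparing a predicted position against catalogued detections within the 4" radius, can be sketched as below. This is an illustrative brute-force version under our own assumptions (hypothetical function names, coordinates in degrees, haversine separation); the survey pipeline's tile indexing, interpolation, and parallel catalog loading are not reproduced here.

```python
import math


def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def match_detections(predicted, detections, radius_arcsec=4.0):
    """Return indices of detections within radius_arcsec of the predicted (ra, dec)."""
    ra0, dec0 = predicted
    return [
        i
        for i, (ra, dec) in enumerate(detections)
        if angular_separation_arcsec(ra0, dec0, ra, dec) <= radius_arcsec
    ]


# Made-up example: one source 1.8" away (a match) and one 7.2" away (not a match).
matches = match_detections((10.0, 20.0), [(10.0, 20.0005), (10.0, 20.002)])
```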

The results of the known-asteroid search, as well as the
derived lightcurve data (described later), are stored in a
relational database, the size and contents of which are summarized
in Table 1. Out of ~18 million predicted single-epoch
asteroid sightings (including predicted magnitudes as dim as
V ≈ 23, well below PTF’s sensitivity), there were 8.8 million
positive detections (within a 4" radius). Of these, we define
4.4 million detections as ‘reliable’ as they (1) lack any catalogued
background sources within the 4" radius, (2) have
a calibrated magnitude uncertainty of less than 0.1 mag, (3)
lack any processing flags indicative of contamination. Figure
2 compares predicted, positive and ‘reliable’ detections; the
middle and right panels of Figure 2 show that our definition
of ‘reliable’ seems to include a small fraction of likely bad
observations (<1% contamination, note the vertical log scale),
namely those which have distance residuals greater than ~1"
or magnitude residuals greater than ~1 mag. Because these
reliable detections are the subset of observations which we
input into our lightcurve fitting model (Section 4), the fitting
algorithm includes logic designed to remove isolated data
points that have very large residuals, either with respect to the
median lightcurve value or relative to their uncertainty.

Figure 2. Comparison of predicted asteroid sightings against positive and ‘reliable’ asteroid detections. We define a ‘reliable’ detection as any positive detection
which (1) lacks any catalogued background sources within a 4" radius, (2) has a calibrated magnitude uncertainty of less than 0.1 mag, (3) lacks any processing
warning flags. As suggested by the middle and right columns of plots, this definition of ‘reliable’ still contains some small contamination (at the <1% level) from
uncatalogued background sources and/or noise, as indicated by detections with distance residuals greater than ~1 arcsecond or magnitude residuals of greater
than ~1 mag. In panel D, the less than 100% completeness at the bright end reflects the non-negligible probability that any asteroid will fall within 4" of a
catalogued background source (regardless of the magnitude of either the asteroid or the background source).


Table 1

Description of the PTF asteroid database. Includes PTF data acquired from March 2009 through July 2014, excluding Hα survey data.

table | # rows | example columns (not necessarily comprehensive)
--- | --- | ---
PTF tiles | 11,169 | R.A., Dec., tile ID
exposures | 304,982 | epoch, filter, exposure time, absolute photometric zeropoint, tile ID, exposure ID
CCD images | 3,305,426 | CCD ID, corners RA & Dec, seeing, limiting mag., relative phot. zeropoint, # of sources, exposure ID, image ID
asteroids | 401,810 | name, orbital elements, color data (e.g., SDSS), IR data (e.g., WISE), known rotation period, asteroid ID (number)
predicted sightings | 17,929,274 | R.A., Dec., rates, helio- & geocentric range, phase & elong. angle, pred. V mag., image ID, asteroid ID, prediction ID
positive detections | 8,842,305 | R.A., Dec., instrumental mag., local zeropoint, shape data, quality flags, prediction ID, lightcurve ID, detection ID
reliable detectionsᵃ | 4,392,395 | detection ID
lightcurvesᵇ | 587,466 | # of constituent detections, filter, opposition year, median mag., asteroid ID, lightcurve ID
lightcurve fitsᶜ | 54,296 | fitted lightcurve parameters, human-assigned quality code, machine-classified quality index, lightcurve ID, fit ID
reliable-period fitsᵈ | 9,033 | fit ID
reliable-G₁₂ fitsᵉ | 3,902 | fit ID

ᵃ ‘Reliable’ detections are those free from possible background-source or bright-star contamination, magnitude errors >0.1 mag, and certain SExtractor flags.
ᵇ A lightcurve is here defined as a set of positive detections of a given asteroid in a single filter and opposition.
ᶜ Lightcurve fits only exist for lightcurves which contain at least 20 reliable detections and converged to a solution during the lightcurve-fitting process.
ᵈ Fits have reliable rotation periods if a human screener labels the period reliable and the machine classifier rates it above a certain quality threshold (see text).
ᵉ Fits have a reliable G₁₂ phase-function parameter if (1) the amplitude is <0.1 mag or the period is reliable, and (2) the fit has sufficient phase-angle coverage (see Section 6.3).
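
The three criteria that define a ‘reliable’ detection in this section translate directly into a filter. The sketch below assumes a hypothetical dictionary layout for a detection record (the field names are ours, not the database schema) and uses the 4" and 0.1 mag thresholds quoted in the text.

```python
def is_reliable(detection, match_radius_arcsec=4.0, max_mag_err=0.1):
    """Apply the three 'reliable'-detection criteria to one detection record.

    detection is a dict with hypothetical keys:
      nearest_catalog_source_arcsec -- distance to the nearest catalogued
                                       background source, or None if there is none
      mag_err                       -- calibrated magnitude uncertainty (mag)
      flags                         -- list of processing warning flags
    """
    nearest = detection.get("nearest_catalog_source_arcsec")
    uncontaminated = nearest is None or nearest > match_radius_arcsec
    precise = detection["mag_err"] < max_mag_err
    unflagged = not detection["flags"]
    return uncontaminated and precise and unflagged


# Made-up records for illustration.
good = {"nearest_catalog_source_arcsec": None, "mag_err": 0.04, "flags": []}
blended = {"nearest_catalog_source_arcsec": 2.5, "mag_err": 0.04, "flags": []}
noisy = {"nearest_catalog_source_arcsec": 30.0, "mag_err": 0.2, "flags": []}
reliable = [d for d in (good, blended, noisy) if is_reliable(d)]
```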


3. LIGHTCURVE MODEL 

Equation (1) presents the overall form and notation of our
asteroid lightcurve model. In this section we describe the
detailed parameterization and assumptions of the model.

3.1. Rotation component 
3.1.1. Intra-opposition constraint 

The most important parameter in the rotation component
(the δ in Equation [1]) is the synodic spin period P, a constant
which satisfies

$$\delta(\tau) = \delta(\tau + nP), \qquad (2)$$

where τ = t − Δ/c is the light-time-corrected observation
timestamp, Δ = Δ(t) is the asteroid’s geocentric distance, c
is the speed of light, and n is any integer satisfying

$$|n| < P_{\rm orb}/P, \qquad (3)$$

where P_orb is the synodic orbital period,

$$P_{\rm orb} = \left(\frac{1}{\rm yr} - \frac{1}{T_{\rm orb}}\right)^{-1} = \left(\frac{1}{\rm yr} - \frac{\sqrt{GM_\odot}}{2\pi a_{\rm orb}^{3/2}}\right)^{-1}, \qquad (4)$$

where T_orb is the asteroid’s sidereal orbital period and a_orb is
its orbital semi-major axis (related by Kepler’s third law). P_orb
is the time elapsed between the asteroid’s consecutive oppositions.
Pursuant to this restriction, we constrain each δ solution
using observations from within the same opposition, i.e., for
most asteroids, within a 1.1- to 1.6-year interval centered on
the date of the locally minimum observed α.
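
The two quantities introduced above, the light-time-corrected epoch and the synodic orbital period, are simple to compute. The sketch below is illustrative: it assumes Kepler's third law in solar units (T_orb in years equals a_orb in AU raised to the 3/2 power), an Earth period of exactly one year, and an orbit exterior to Earth's; the function names are ours.

```python
SPEED_OF_LIGHT_AU_PER_DAY = 173.1446  # c expressed in AU per day


def light_time_corrected_epoch(t_days, Delta_au):
    """tau = t - Delta / c, with t in days and Delta in AU."""
    return t_days - Delta_au / SPEED_OF_LIGHT_AU_PER_DAY


def sidereal_period_yr(a_orb_au):
    """Kepler's third law for a body orbiting the Sun: T_orb = a_orb**1.5 years."""
    return a_orb_au ** 1.5


def synodic_period_yr(a_orb_au):
    """Time between consecutive oppositions, valid for a_orb > 1 AU."""
    if a_orb_au <= 1.0:
        raise ValueError("this sketch assumes an orbit exterior to Earth's")
    return 1.0 / (1.0 - 1.0 / sidereal_period_yr(a_orb_au))


inner_belt = synodic_period_yr(2.2)  # about 1.44 yr
outer_belt = synodic_period_yr(3.2)  # about 1.21 yr
```

Both example values fall inside the 1.1- to 1.6-year interval quoted above for most asteroids.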

The intra-opposition restriction is important given that our
data set (described in the next section) spans ~5 years. For
an asteroid with a zero-inclination circular orbit and spin axis
perpendicular to its orbital plane, we can relax Equation (3)
to allow n to be any integer, in which case δ can be constrained
using observations spanning many years. In general,
however, Equation (2) must be modified to accommodate a
varying viewing geometry with respect to the spin axis:

$$\delta(\tau) = F(\tau)\,\delta(\tau + nP), \qquad (5)$$

where F is some unknown periodic function satisfying
F(τ) = F(τ + mT_orb), where m is any integer and T_orb is
the sidereal orbital period. Provided the amplitude of F is not
large relative to that of δ, and provided the spin vector is not
changing with respect to the orbital plane (i.e., precessing⁵)
on a timescale comparable to the orbital period, we are justified
in assuming Equation (2) (with the Equation [3] restriction) applies.

3.1.2. Second-order Fourier series 

Any δ satisfying Equation (2) can be approximated to arbitrary
precision using a Fourier series. Harris et al. (2014)
discuss why, from a geometric standpoint, the second harmonic
tends to dominate an asteroid’s fitted δ. As noted earlier
(Section 1.1), most large asteroids approximately resemble
triaxial prolate ellipsoids (e.g., Jacobi ellipsoids), having
equatorial axis ratios of at most ~3:1 (corresponding
to a δ_max − δ_min amplitude of ~1.2 mag). For less extreme
axis ratios (specifically, those producing a ~0.4 mag or
smaller second-harmonic amplitude), other harmonics related
to shape or albedo asymmetries may contribute comparable
coefficients to the Fourier approximation of δ.
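
The correspondence between a 3:1 axis ratio and a ~1.2 mag amplitude follows from a simple geometric argument, sketched below. The sketch assumes an equator-on view of an ellipsoid with uniform albedo, so that the ratio of maximum to minimum projected area equals the equatorial axis ratio a/b; the function names are ours.

```python
import math


def amplitude_from_axis_ratio(a_over_b):
    """Peak-to-trough amplitude (mag) for an equator-on ellipsoid: 2.5 log10(a/b)."""
    return 2.5 * math.log10(a_over_b)


def min_axis_ratio_from_amplitude(amplitude_mag):
    """Lower bound on a/b implied by an observed amplitude: 10**(0.4 * amplitude)."""
    return 10.0 ** (0.4 * amplitude_mag)


amp_3_to_1 = amplitude_from_axis_ratio(3.0)        # about 1.19 mag
ratio_0p4_mag = min_axis_ratio_from_amplitude(0.4)  # about 1.45
```

Viewing the ellipsoid away from the equatorial plane only reduces the amplitude, which is why the fitted amplitude gives a lower bound on the elongation.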

The PTF survey program has, on a few rare occasions,
conducted high-cadence (~10-minute spaced) observations
of low ecliptic latitude fields. These runs produced a set of
~1,000 densely-sampled main-belt asteroid rotation curves,
which have already been analyzed and published (Polishook
et al. 2012; Chang et al. 2014a). These high-cadence “pilot
studies” are relevant to our present work in that they demonstrate
(1) the quality of the PTF survey’s photometric calibration
for asteroids with unambiguously valid δ solutions,
and (2) the above-described prevalence of a dominant second
harmonic in most of the objects sampled.

⁵ Principal-axis rotation (a stable equilibrium state) is assumed for most
planetary bodies. Burns & Safronov (1973) discuss the relevant timescales of
spin evolution.

Following these pilot studies, we adopt a second-order
Fourier series model:

$$\delta = \sum_{k=1,2} \left[ A_{1,k} \sin\left(\frac{2\pi k \tau}{P}\right) + A_{2,k} \cos\left(\frac{2\pi k \tau}{P}\right) \right], \qquad (6)$$

where τ is the light-time corrected epoch (cf. Equation [2]).
In the pilot studies, most of the fitted δ solutions qualitatively
resemble a simple sine or cosine function. Such a solution can
be represented by either a:

1. first harmonic with period P = P₁ (with A_{i,1} ≠ 0 and
A_{i,2} = 0), or

2. second harmonic of period P = 2P₁ (with A_{i,1} = 0
and A_{i,2} ≠ 0).

Given the prolate ellipsoid model, choice (2) is more realistic
and hence preferred. However, again recognizing that other
harmonics can have a non-negligible contribution, in fitting δ
to our lightcurve sample we allow the first-harmonic coefficients
A_{i,1} to be non-zero, but introduce logic into the fitting
algorithm (cf. Section 4) which checks for double-period
solutions satisfying certain criteria and iterates accordingly.
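
The ambiguity between choices (1) and (2) can be checked numerically. The sketch below evaluates a second-order Fourier series of the form of Equation (6) and confirms that a pure first harmonic with period P₁ is indistinguishable from a pure second harmonic with period 2P₁. The coefficient layout, the period, and the 0.2 mag amplitude are illustrative assumptions.

```python
import math


def delta_fourier(tau, period, coeffs):
    """Second-order Fourier series for the rotation term.

    coeffs maps the harmonic number k (1 or 2) to a pair (A_1k, A_2k):
    the sine and cosine coefficients in magnitudes. tau and period must
    be in the same time unit.
    """
    total = 0.0
    for k, (a_sin, a_cos) in coeffs.items():
        arg = 2.0 * math.pi * k * tau / period
        total += a_sin * math.sin(arg) + a_cos * math.cos(arg)
    return total


P1 = 0.25  # hypothetical period of the observed sinusoid, in days
first_harmonic = {1: (0.2, 0.0), 2: (0.0, 0.0)}   # choice (1): P = P1
second_harmonic = {1: (0.0, 0.0), 2: (0.2, 0.0)}  # choice (2): P = 2 * P1

times = [0.01 * i for i in range(100)]
max_diff = max(
    abs(delta_fourier(t, P1, first_harmonic) - delta_fourier(t, 2.0 * P1, second_harmonic))
    for t in times
)
```

Since the data alone cannot separate the two cases, a preference for the double-period solution has to come from the physical prior (the ellipsoid model), as described above.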

3.2. Phase-function component 

In this work we simultaneously fit each lightcurve’s phase
function φ along with its rotation curve δ (cf. Equation [1]).
This approach is intermediate in complexity between some
of the simpler, two-parameter (δ-neglecting) models that have
been applied to very large data sets (e.g., Williams 2012;
Oszkiewicz et al. 2012), and the more complex, shape plus
pole-orientation models (Kaasalainen 2004; Cellino et al. 2009;
Hanus & Durech 2012) which can involve tens of parameters
and require data spanning multiple oppositions.

Regarding the former class of models, we note that there is
a formal statistical problem associated with neglecting δ when
fitting φ. If modeling the observations M by V′ = V − δ =
H + 5 log₁₀(rΔ) − 2.5 log₁₀(φ), then the distribution of residuals
M − V′ is not Gaussian. Assuming δ is a sinusoid with
amplitude A, for observations M sampling the lightcurve at
random times, the residual probability density function p =
p(M − V′) has a local minimum value p_min at M − V′ = 0 and
maximum value p_max near M − V′ = ±A. Thus p is bimodal
and roughly bowl-shaped, not at all Gaussian-shaped. The
uncertainty in φ produced by a standard χ² minimization,
which assumes Gaussian-distributed errors, is thus inaccurate.
However, since p is symmetric about M − V′ = 0, for
densely-sampled data the fitted phase function φ remains
unaffected by neglecting δ; in such a case the only effect is an
underestimated uncertainty.
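
The bowl-shaped residual distribution described above is easy to reproduce with a small simulation. The sketch below draws random rotational phases for a sinusoid of amplitude A and histograms the resulting residuals; the amplitude, sample size, bin count, and seed are arbitrary illustrative choices.

```python
import math
import random


def residual_histogram(amplitude=0.3, n_samples=50_000, n_bins=10, seed=0):
    """Histogram of A*sin(phase) for uniformly random phases.

    Returns a list of counts in n_bins equal-width bins spanning [-A, +A].
    """
    rng = random.Random(seed)
    counts = [0] * n_bins
    width = 2.0 * amplitude / n_bins
    for _ in range(n_samples):
        resid = amplitude * math.sin(2.0 * math.pi * rng.random())
        idx = min(int((resid + amplitude) / width), n_bins - 1)
        counts[idx] += 1
    return counts


counts = residual_histogram()
# The outermost bins (near -A and +A) hold several times more samples than
# the central bins (near zero), i.e., the density is bimodal, not Gaussian.
```

For a pure sinusoid the analytic form is the arcsine distribution, whose density diverges at ±A and is lowest at zero.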

We obtain three separate fits for each lightcurve, each using
a different phase function (φ) and allowing for unique solutions
for H and δ in Equation (1). The three phase-function
models are:


1. the two-parameter model of Shevchenko (1997),

2. the one-parameter G model (Bowell et al. 1989),

3. the one-parameter G₁₂ model (Muinonen et al. 2010).

In this section we review and motivate the application of
each of these φ models.
3.2.1. Two-parameter Shevchenko model

Shevchenko (1997) introduced a phase function dependent
on two parameters; in terms of Equation (1) the model is⁶

$$-2.5\log_{10}[\phi(\alpha)] = \beta\alpha - C\,\frac{\alpha}{1+\alpha}, \qquad (7)$$

where β has units of mag/deg and C is the amplitude of the
opposition surge (units of mag). This model was subsequently
considered in depth by Belskaya & Shevchenko (2000), hereafter
B&S, who compiled the most complete (to date) set of
high-precision, targeted phase curve observations of main-belt
asteroids from various data sets spanning several decades.

Though in practice Shevchenko’s model is the least commonly
used phase function out of the three we consider, it
is by far the simplest to express mathematically, and is the
only model for φ whose parameters have linear dependence
in Equation (1).

Furthermore, this model’s parameters are the most straightforward
to associate with physical asteroid properties. B&S
highlighted a robust relationship between an asteroid’s (β, C)
phase-function parameters and its geometric albedo⁷. As we
later explore a similar relationship in the present work, we
here review the basis of this observation.

The geometric albedo p_V is formally defined in terms of the
phase function φ:

$$p_V \equiv \frac{A_{\rm bond}}{2\int_0^{\pi} \phi(\alpha)\sin(\alpha)\,d\alpha} \equiv \frac{A_{\rm bond}}{q}, \qquad (8)$$

where A_bond is the (visible) bond albedo, defined as the total
visible light energy reflected or scattered by the asteroid (in all
directions) divided by the total visible light energy incident
upon the asteroid (from the Sun). We also here define the
phase integral q.
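
A numerical evaluation of the phase integral can be sketched as follows, assuming the standard definition q = 2∫φ(α) sin(α) dα over 0 ≤ α ≤ π (α in radians), so that the geometric albedo equals A_bond/q. As a check we use the Lambertian sphere mentioned in Section 1.2, for which the analytic result is q = 3/2; the function names and the midpoint-rule integrator are our own choices.

```python
import math


def phase_integral(phi, n_steps=20_000):
    """q = 2 * integral from 0 to pi of phi(alpha) * sin(alpha) d(alpha).

    phi is a callable taking the phase angle in radians. The integral is
    evaluated with the midpoint rule.
    """
    h = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        alpha = (i + 0.5) * h
        total += phi(alpha) * math.sin(alpha)
    return 2.0 * total * h


def lambert_sphere(alpha):
    """Phase function of an ideal Lambertian-scattering sphere (phi(0) = 1)."""
    return (math.sin(alpha) + (math.pi - alpha) * math.cos(alpha)) / math.pi


def geometric_to_bond_ratio(phi):
    """Ratio of geometric albedo to bond albedo, i.e., 1 / q."""
    return 1.0 / phase_integral(phi)


q_lambert = phase_integral(lambert_sphere)
ratio_lambert = geometric_to_bond_ratio(lambert_sphere)
```

Any of the asteroid phase functions discussed in this section can be passed in place of `lambert_sphere`, provided the phase angle is converted to the units that function expects.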

B&S showed that, in the range of β observed from S-type
to C-type asteroids, β and C are empirically correlated, in a
relation that we approximate here as

$$C \approx (0.9\ {\rm mag}) - (17\ {\rm deg})\,\beta \quad {\rm for}\ 0.03 < \frac{\beta}{\rm mag/deg} < 0.05. \qquad (9)$$

Using Equation (9) to substitute for C in Equation (7), inserting
the result into Equation (8) and numerically evaluating the
integral gives

$$\frac{p_V}{A_{\rm bond}} \ [\ldots]\ \left(\frac{2.2\,\beta}{\rm mag/deg}\right) \quad {\rm for}\ 0.03 < \frac{\beta}{\rm mag/deg} < 0.05. \qquad (10)$$

B&S saw a negative correlation between p_V and β in the
data⁸, consistent with Equation (10) only if either A_bond is
assumed constant among different asteroid types (not a reasonable
assumption) or if A_bond negatively correlates with β,
which B&S did not explicitly show.


⁶ In Shevchenko’s original notation, β is denoted b and C is denoted a.
Moreover, in the original notation, φ(0) = −a; we here added a constant
term +a to make φ(0) = 1, following convention with other phase functions.

⁷ Also known as the visible albedo or the physical albedo.

⁸ B&S actually stated the correlation in terms of log p_V vs. β, though the
range in β is sufficiently small that p_V vs. β is essentially valid as well.


The bond albedo A_bond can be thought of as an intrinsic,
bulk-compositional characteristic of an asteroid’s surface⁹,
much like an asteroid’s color, whereas β and C relate (in part)
to the textural, particulate, and macroscopic roughness of the
asteroid’s surface. B&S and other authors separately associate
β with the shadow-hiding effect and C with the coherent
backscatter effect. Both of these physical phenomena are
understood from a theoretical standpoint (e.g., Helfenstein &
Veverka 1989; Hapke 2012) to be functions of A_bond, with β
negatively related to A_bond and C positively related. This is
consistent with Equation (9), and renders Equation (10) consistent
with B&S’s noted p_V-vs.-β correlation. Other properties
such as particle size, particle geometry and regolith porosity
also have predicted (and laboratory-measured) contributions
to the observed phase function (Hapke 2012 and refs.
therein); these properties can conceivably vary independently
of A_bond.

In short, our interpretation of the S-type and C-type asteroid
data reviewed by B&S is that a compositional indicator
(A_bond) correlates with indicators of two independent phenomena
(β and C) that contribute to how light scatters from
an asteroid’s surface. This statement intentionally makes no
mention of p_V, since Equation (8) tells us p_V by definition
varies with β (in a non-obvious way) and with A_bond, the latter
being a more basic compositional attribute.

As stated above, the phase function can be related to properties
other than A_bond, such as regolith porosity. Many of these
other properties in theory and experiment contribute to effects
involving multiply-scattered light, and therefore do not alter
the effect of shadow-hiding (β-term in Equation [7]), which
is dominated by singly-scattered light (Hapke 2012). In contrast,
the coherent backscatter effect (C-term) does involve
multiply-scattered light. B&S saw non-monotonic behavior
in C as a function of p_V when including the rarer, high-p_V
E-type asteroids in the same plot as C and S types. E types
do conform, however, to the same negative monotonic trend
in p_V-vs.-β satisfied by the C and S types, consistent with
the hypothesis that β is adequately expressed as a function of
A_bond alone, yet E types have a lower-than-predicted C value
based on extrapolation of Equation (9).

One possibility is that Equation (9) is not valid for all
asteroids, but must be replaced by some unknown non-monotonic
relationship, possibly because C depends non-monotonically
on A_bond and/or has comparable dependence on other
properties (e.g., porosity or grain size). Assuming Equation (7)
is a sufficiently general model for φ, and lacking knowledge
of a good model for C, it follows that β and C should in
practice always be fit separately. Another possibility is that
Equation (7) is an incorrect or incomplete model; however,
B&S described no instances wherein their model was unable
to adequately fit the data for a particular asteroid or class of
asteroids.


^9 More accurately, the single-scattering albedo w, which is the analog of
A_bond for a “point-source” particle, more fundamentally embodies this
bulk-compositional attribute. Hapke (2012) details how A_bond is solely a function
of w for an asteroid whose surface consists of isotropic scatterers; we here
use A_bond as a proxy for w.


3.2.2. Lumme-Bowell G model

The next phase function model we consider is the Lumme-Bowell
model (Bowell et al. 1989), also known as the (H,G)
or IAU phase function:

$$
\phi = (1 - G)\,\phi_1 + G\,\phi_2, \qquad
\begin{cases}
\phi_1 = \exp\left(-3.33\,\tan^{0.63}[\alpha/2]\right) \\
\phi_2 = \exp\left(-1.87\,\tan^{1.22}[\alpha/2]\right)
\end{cases}
\tag{11}
$$

Like Shevchenko’s model, this model includes two terms
(the basis functions φ_1 and φ_2) representing two
physically-distinct contributions to the observed φ. As detailed in
Bowell et al. (1989), this model is semi-empirical in that it was
derived from basic principles of radiative transfer theory with
certain assumptions, and at various stages tailored to match
existing laboratory and astronomical observations. That the
two basis functions’ coefficients are related to a single
parameter G bears resemblance to the β-vs.-C correlation described
by Equation (9).

Marsden (1986) marked the IAU’s adoption of this phase
function as a standard model for predicting an asteroid’s
brightness. Since then this model has seen widespread
application, and is often used with the assumption G = 0.15 (e.g.,
in the ephemeris computation services offered by the MPC
and JPL). Harris & Young (1988) present mean values of G
for several of the major asteroid taxonomic classes (based on
a sample of ~80 asteroids), with G = 0.15 being an average
between the C types (G ≈ 0.08) and the S types (G ≈ 0.23).
The G-model fails to accurately fit the rarer D types (which
have linear phase curves) and E types (which have very sharp
opposition spikes), whereas the Shevchenko model can
properly accommodate these rarer types.

Use of the Lumme-Bowell φ in our lightcurve model
(Equation [1]) introduces a second non-linear parameter (G) into
the model, the period P being the other non-linear parameter.
This complicates the fitting algorithm somewhat, as described
in Section 4.
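
As an illustration of Equation (11), the following minimal Python sketch evaluates the Lumme-Bowell phase function and the corresponding dimming in magnitudes. The function names are hypothetical and are not taken from the authors' code; the phase angle is assumed to be supplied in degrees.

```python
import math

def hg_phase_function(alpha_deg, G=0.15):
    """Lumme-Bowell (H,G) phase function phi(alpha) of Equation (11).

    alpha_deg is the solar phase angle in degrees (an assumed convention).
    """
    half = math.radians(alpha_deg) / 2.0
    phi1 = math.exp(-3.33 * math.tan(half) ** 0.63)
    phi2 = math.exp(-1.87 * math.tan(half) ** 1.22)
    return (1.0 - G) * phi1 + G * phi2

def phase_dimming_mag(alpha_deg, G=0.15):
    """Dimming relative to zero phase angle, -2.5 log10(phi), in magnitudes."""
    return -2.5 * math.log10(hg_phase_function(alpha_deg, G))
```

Because φ_2 falls off more slowly than φ_1, a larger G gives less dimming at a fixed phase angle, which is the sense in which the S types (G ≈ 0.23) have shallower phase curves than the C types (G ≈ 0.08).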


3.2.3. Muinonen et al. G_12 model

The third phase function model we consider, introduced by
Muinonen et al. (2010), bears resemblance to the G-model but
includes a second free parameter and a third basis function:

$$
\phi = G_1\,\phi_1 + G_2\,\phi_2 + (1 - G_1 - G_2)\,\phi_3 \tag{12}
$$

As opposed to the analytic trigonometric basis functions of
the G-model, here φ_1, φ_2 and φ_3 (all functions of α alone) are
defined in terms of cubic splines (see Muinonen et al. 2010 for
the exact numerical definitions). Assuming the coefficients
G_1 and G_2 are constrained independently, these basis
functions were designed to provide the most accurate fits to the
phase functions of all major asteroid taxonomic types,
including the rarer D types and E types.

For situations where fitting G_1 and G_2 separately is
infeasible, Muinonen et al. (2010) specialized their above model to
make it a function of a single parameter, G_12, which
parameterizes G_1 and G_2 using piecewise functions:

$$
G_1 = \begin{cases}
0.7527\,G_{12} + 0.06164 & \text{if } G_{12} < 0.2; \\
0.9529\,G_{12} + 0.02162 & \text{otherwise;}
\end{cases}
\qquad
G_2 = \begin{cases}
-0.9612\,G_{12} + 0.6270 & \text{if } G_{12} < 0.2; \\
-0.6125\,G_{12} + 0.5572 & \text{otherwise.}
\end{cases}
\tag{13}
$$

In this work we use this single-parameter G_12 form of the
Muinonen et al. model, making it analogous to the G-model
in terms of implementation, including the complication
associated with a non-linear parameter.

3.2.4. Multi-parameter Hapke model

Just as we commented on the more rigorous means of
fitting a rotation curve via 3D shape modeling with
multi-opposition data, for completeness we note that a more
rigorous model (than the three presented above) exists for phase
functions. Given better-sampled lightcurves and more
computational power, future modeling of large photometric datasets
would benefit from applying the more theoretically-motivated
model of Hapke (2012), an abbreviated form of which is

$$
\phi = \frac{K}{p_V}\left[\left(\frac{w}{8}\,(B_S\,g - 1) + \frac{r_0}{2}\,(1 - r_0)\right)h + \frac{2}{3}\,r_0^2\,\phi_L\right]B_C \tag{14}
$$

Here w is the single-scattering albedo (cf. Footnote 9), of
which r_0 is solely a function. The remaining factors all are
functions of phase angle (α). Each opposition-surge term (B_S
and B_C) has two free parameters (width and amplitude). K
depends on the mean topographic roughness (a function of
one free parameter); g is the single-scattering angular
distribution function (typically includes one parameter); h is a
function of α only; and φ_L is the phase function of an ideal
Lambertian-scattering sphere (a simple function of α).

With its φ ∝ p_V^-1 dependence, the Hapke model
(Equation [14]) can conveniently eliminate both p_V and H from the
modeling process. Inserting Equation (14) into Equation (1),
and using the common relation^10

$$
H = -5 \log_{10}\left(\frac{D\,\sqrt{p_V}}{1329\ \mathrm{km}}\right), \tag{15}
$$

where H is the absolute visual magnitude, D is the asteroid’s
effective diameter and 1329 km is a constant (set by the
arbitrarily-defined magnitude of the Sun), produces a model
with many physically meaningful parameters and free of both
H and p_V.
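
The piecewise mapping of Equation (13) is simple to express in code. The sketch below (hypothetical function names, not the authors' implementation) converts G_12 into the three basis-function weights of Equation (12); the cubic-spline basis functions themselves are defined in Muinonen et al. (2010) and are not reproduced here.

```python
def g12_to_g1_g2(g12):
    """Map the single parameter G12 to (G1, G2) following Equation (13)."""
    if g12 < 0.2:
        g1 = 0.7527 * g12 + 0.06164
        g2 = -0.9612 * g12 + 0.6270
    else:
        g1 = 0.9529 * g12 + 0.02162
        g2 = -0.6125 * g12 + 0.5572
    return g1, g2

def basis_weights(g12):
    """Weights multiplying phi_1, phi_2 and phi_3 in Equation (12)."""
    g1, g2 = g12_to_g1_g2(g12)
    return g1, g2, 1.0 - g1 - g2
```

With the rounded coefficients quoted above, the two branches agree at G_12 = 0.2 to within about 1e-4, so the mapping is continuous for practical purposes.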


4. LIGHTCURVE-FITTING ALGORITHM 

We solve Equation (1) using a custom linear least squares
(LLSq) method. A basic review of LLSq can be found in
Hogg et al. (2010). Each fitted asteroid lightcurve contains
N_obs ≥ 20 observations, with measured apparent
magnitudes m_i and measurement uncertainties σ_i. All instrumental
magnitudes are elliptical aperture (Kron 1980) measurements
(SExtractor’s MAG_AUTO) calibrated with a local zeropoint
(i.e., the ‘ZPVM’ correction of Ofek et al. 2012a). The
uncertainties contain a Poisson-noise component (SExtractor’s
MAGERR_AUTO) as well as systematic error from the
calibration. For images lacking a relative photometric solution, the
relevant systematic error is the APRS RMS parameter in the
^10 Rather than attributing it to any specific author(s), we note that Equation
(15) may be derived directly using Equation (8) and the following definition
of the bond albedo, which we stated in words immediately after Equation (8):

^ (10-"sun/2.5/4^AU2) X 7r(D/2)2 

where V(α) = H − 2.5 log10 φ(α) is Equation (1) evaluated at δ = 0 and
r = Δ = 1 AU.
PTF database; for images having a relative photometric
solution, the systematic error is a combination of the sysErr and
zeroPointErr database quantities (added in quadrature).

In all cases, our model (Equation [1]) is non-linear in at
least one parameter (the period P, or equivalently the
frequency f = 1/P). We test N_frq evenly-spaced frequencies
between f = 0 (infinite rotation period) and f = 12 day^-1,
i.e., up to the ~2-hour spin barrier.

Asteroids rotating faster than the ~2-hour spin barrier
are likely monolithic objects and (particularly if larger than
~150 m) are interesting in their own right (cf. the discussion
in Pravec et al. 2002). However, given the apparent observed
rarity of such super-fast rotators (SFRs) and the large interval
in frequency space that must be searched to discover them,
we impose 2 hours = 12 cycles per day as our upper limit
on fitted frequency in order to make computational time
reasonable without sacrificing sensitivity to the majority of
asteroids’ spin rates. Chang et al. (2014a) present preliminary
results of an independent, ongoing effort to use PTF data (or
at least specific subsets thereof) to search for SFRs, with at
least one SFR having been discovered and confirmed (Chang
et al. 2014b).

We use a frequency spacing Δf = 1/(4Δt), where Δt
is the time interval between the first and last observation in
the lightcurve. Formally Δt can be as long as 1.1 to 1.6 yr
for most asteroids (cf. Section 3.1.1); however, the median
value of Δt (among lightcurves that ultimately acquired fits)
is ~45 days, with 16th and 84th percentiles of 13 and 106 days,
respectively.
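
The frequency sampling just described can be sketched as follows. This is an illustrative Python fragment with hypothetical names; it assumes observation times in days and starts the grid at Δf rather than at f = 0, since a zero-frequency trial has no rotational signal to fit (an assumption of this sketch, not a statement about the authors' code).

```python
import math

def frequency_grid(times_days, f_max=12.0, oversample=4.0):
    """Evenly-spaced trial frequencies (cycles/day) with df = 1/(oversample * dt).

    dt is the time between the first and last observation; f_max = 12
    cycles/day corresponds to the ~2-hour spin barrier.
    """
    dt = max(times_days) - min(times_days)
    n_frq = int(math.floor(f_max * oversample * dt + 1e-9))
    return [k / (oversample * dt) for k in range(1, n_frq + 1)]
```

For the median Δt of ~45 days this gives Δf = 1/180 cycles per day and 2160 trial frequencies; a lightcurve spanning a full year would need roughly eight times as many.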

In addition to the non-linear parameter f, the lightcurve
model in general has N_lin linear parameters. We seek to solve
the following tensor equation for X:

$$
m_i = \sum_{k} L_{ijk}\,X_{jk}, \qquad
\begin{cases}
i = 1, 2, \ldots, N_{\rm obs} \\
j = 1, 2, \ldots, N_{\rm frq} \\
k = 1, 2, \ldots, N_{\rm lin}
\end{cases}
\tag{16}
$$

where m_i is the i-th observation, L is the ‘design matrix’ (a 3D
array of size N_obs × N_frq × N_lin) and X is the linear-parameter
matrix (N_frq × N_lin) containing the linear-parameter solutions
as a function of frequency.


4.1. Linear phase-function parameters

For the particular case wherein we use Shevchenko’s model
(Equation [7]) for the phase function φ, the design matrix is

$$
L_{ij} = \begin{pmatrix}
1 \\
\sin(2\pi f_j \tau_i) \\
\cos(2\pi f_j \tau_i) \\
\sin(4\pi f_j \tau_i) \\
\cos(4\pi f_j \tau_i) \\
\alpha_i \\
\alpha_i/(1 + \alpha_i)
\end{pmatrix}
\tag{17}
$$

where the k-index has been omitted with the convention that
k = 1 is the 1st row of the above column vector, k = 2 is the
second row, etc. Here τ_i and α_i are the time and phase angle
of the i-th observation, f_j is the j-th frequency, etc. Likewise,
the linear-parameter matrix X in this case is

$$
X_j = \begin{pmatrix}
H_j \\
(A_{1,1})_j \\
(A_{2,1})_j \\
(A_{1,2})_j \\
(A_{2,2})_j \\
\beta_j \\
C_j
\end{pmatrix}
\tag{18}
$$

where H_j is the fitted absolute magnitude for the j-th
frequency, etc.

The general LLSq solution to Equation (16) is

$$
X_{jk} = \sum_{\ell,n,p} S_{jk\ell}\,L_{nj\ell}\,(B^{-1})_{np}\,m_p \tag{19}
$$

where B^-1 is the inverse of the data-covariance matrix B:

$$
B = \begin{pmatrix}
\sigma_1^2 & 0 & \cdots & 0 \\
0 & \sigma_2^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{N_{\rm obs}}^2
\end{pmatrix}
\tag{20}
$$

and S_jkℓ is the parameter-covariance matrix, given by

$$
S_{jk\ell} = \left[(s_j)^{-1}\right]_{k\ell} \tag{21}
$$

where in the above definition we invert each of the N_frq
matrices s_j, these being defined by

$$
(s_j)_{k\ell} = \sum_{n,p} L_{njk}\,(B^{-1})_{np}\,L_{pj\ell}. \tag{22}
$$

The elements of the parameter-covariance matrix S are the
variances and covariances of the fitted parameters (as a
function of frequency). The fit’s residuals (as a function of
frequency) are:

$$
R_{ij} = m_i - \sum_{k} L_{ijk}\,X_{jk}, \tag{23}
$$

and the fit’s chi-squared (as a function of frequency) is:

$$
(\chi^2)_j = \sum_{i,n} R_{ij}\,(B^{-1})_{in}\,R_{nj}. \tag{24}
$$

The frequency-dependent chi-squared (χ²)_j is also known
as the periodogram. Formally, the best-fit rotation frequency
corresponds to the minimal value of (χ²)_j, but this may
differ from the preferred frequency solution if the lightcurve
is contaminated by other systematic periodic signals, if the
data suffer from underestimated measurement uncertainties,
or if the best-fit frequency corresponds to a dominant first
harmonic (as opposed to a preferred dominant second harmonic,
cf. Section 3.1.2).
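
To make the linear-algebra steps of Equations (16) through (24) concrete, here is a minimal pure-Python sketch of the per-frequency weighted least-squares solve for the seven-parameter Shevchenko case. It is not the authors' MATLAB implementation: the function names are hypothetical, B is assumed diagonal, and each frequency is solved in a plain loop rather than with vectorized tensor products.

```python
import math

def design_row(t, alpha, f):
    """Row of the design matrix (Equation 17) for one observation."""
    w = 2.0 * math.pi * f * t
    return [1.0, math.sin(w), math.cos(w), math.sin(2.0 * w), math.cos(2.0 * w),
            alpha, alpha / (1.0 + alpha)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def periodogram(times, alphas, mags, sigmas, freqs):
    """Return a list of (chi2, parameters), one entry per trial frequency."""
    wts = [1.0 / s ** 2 for s in sigmas]          # diagonal of B^-1
    out = []
    for f in freqs:
        rows = [design_row(t, a, f) for t, a in zip(times, alphas)]
        n = len(rows[0])
        # s_j = L^T B^-1 L (Equation 22) and right-hand side L^T B^-1 m
        s_j = [[sum(w * r[k] * r[l] for r, w in zip(rows, wts))
                for l in range(n)] for k in range(n)]
        rhs = [sum(w * r[k] * m for r, w, m in zip(rows, wts, mags))
               for k in range(n)]
        x = solve(s_j, rhs)                        # Equation (19)
        resid = [m - sum(rk * xk for rk, xk in zip(r, x))
                 for r, m in zip(rows, mags)]      # Equation (23)
        out.append((sum(w * e * e for w, e in zip(wts, resid)), x))  # Eq. (24)
    return out
```

Solving the normal equations directly, rather than forming the inverse of each s_j, is a choice made here for brevity; the parameter covariances of Equation (21) would require the explicit inverse.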

Figure 3 details our iterative lightcurve-fitting algorithm’s
logic. Fitting commences as long as 20 or more ‘reliable’
data points (cf. Section 2.2 and Figure 2) are associated with
a lightcurve. Irrevocably-bad data points are discarded in the
first round of iterations; these include detections with 7σ or
greater residuals from the initial solution. Causes of such
high residuals include contamination from
background sources missing in the reference catalog, bad
detector pixels that were not flagged by the pipeline, or spurious
zeropoint solutions.

Figure 3. Diagram detailing the logic of this work’s data reduction and analysis. Includes mining the survey for known-asteroid observations, aggregation of the
data into lightcurves, vetting of the lightcurves and an application wherein phase functions are compared to color-derived asteroid taxonomy. See text for details.

In the next stage of iterations, the fit’s χ² per degree of
freedom is reduced to ~1 (formally, it is reduced until it is less
than 3, cf. Figure 3) by gradually inflating the observations’
errorbars through addition of a ‘cosmic error’, so-named
because it encompasses contamination from possible errors (in
all the ‘cosmos’). In general the cosmic error represents the
same diverse contaminating phenomena responsible for the
>7σ deviations seen in the initial iterations (cf. previous
paragraph), just to a lesser extent.

Separately, this errorbar inflation compensates for our
model’s inability to fit each asteroid’s precise periodic
structure using only two harmonic terms in the Fourier series. In
the limit of infinite observations and sufficiently many Fourier
terms, we would ideally expect our data’s errorbars to reflect
true Gaussian variance. However, by truncating the series
at two harmonics and using sufficiently precisely-calibrated
photometry, we are in effect choosing to sacrifice
(downsample) some of our photometric precision to obtain a formally
better fit at the coarser resolution limit of the model.

To illustrate use of the cosmic error, consider the example
of an eclipsing binary lightcurve, i.e., a rotation curve which
is effectively sinusoidal except for a small interval around the
phase of minimum flux, when it dips to a lower-than-predicted
brightness. Examples from our dataset appear in Figure 10.
Observations acquired during such eclipses will have
systematic negative deviations greater in absolute value than would
be explained by Gaussian variance alone. Increasing the
errorbars of these observations will decrease the fit’s χ² without
altering the value of the fitted frequency. The fitted
parameters’ uncertainties (both for frequency and the linear
parameters) are accordingly inflated as a penalty, and the fitted
amplitude will be underestimated. As detailed in Figure 3, the
initial cosmic error used is 0.002 mag, and each iteration it
is multiplied by a factor 1.5 until the χ² is sufficiently low.
If the cosmic error exceeds 0.1 mag, the fitting is aborted. If
the χ² (per degree of freedom) drops below 3 while the
cosmic error is still below 0.1 mag, the fitting process concludes
‘successfully’ (see Figure 3).
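
The cosmic-error schedule described above can be sketched as a short loop. The Python fragment below uses hypothetical names and two simplifying assumptions that are not stated in the text: the cosmic error is added to each errorbar in quadrature, and the residuals are held fixed rather than being refit at each iteration.

```python
def reduced_chi2(residuals, sigmas, cosmic, n_par=7):
    """Chi-squared per degree of freedom with errorbars inflated by `cosmic`.

    Assumes the cosmic error is added in quadrature to each sigma.
    """
    dof = len(residuals) - n_par
    return sum(r * r / (s * s + cosmic * cosmic)
               for r, s in zip(residuals, sigmas)) / dof

def tune_cosmic_error(residuals, sigmas, start=0.002, factor=1.5,
                      chi2_target=3.0, max_error=0.1):
    """Grow the cosmic error until the reduced chi-squared drops below target.

    Returns the cosmic error (mag) on success, or None if it would have to
    exceed max_error, in which case the fit is considered aborted.
    """
    cosmic = start
    while cosmic <= max_error:
        if reduced_chi2(residuals, sigmas, cosmic) < chi2_target:
            return cosmic
        cosmic *= factor
    return None
```

Starting from 0.002 mag and multiplying by 1.5, the cosmic error passes 0.1 mag on the tenth multiplication, so at most ten values are tried before a fit is abandoned.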

Concurrently, each iteration includes a test for the presence
of double peaks in the folded rotation curve (only if the fitted
amplitude is at least 0.1 mag). In particular, if there exist two
maxima and two minima in the folded lightcurve, we demand
that the ratio of these peaks be greater than 0.2. Such a
solution is preferred (cf. Section 3.1.2) given our ellipsoidal shape
assumption, as described by Harris et al. (2014).

Denote as f_best_global the frequency yielding the absolute
minimum χ² per degree of freedom value, denoted χ²_min_global
(after the cosmic error has been tuned). If the folded
lightcurve is single-peaked (or has only a relatively small
secondary peak), then another deep minimum usually exists at
the harmonic frequency f_best_harmonic = 0.5 × f_best_global, the
local minimum χ² value of which we denote χ²_min_harmonic. For
cases wherein χ²_min_harmonic < χ²_min_global + inv-χ²-cdf(0.95, 7),
where inv-χ²-cdf(p, N) is the inverse of the χ² cumulative
distribution function for N free parameters evaluated at p,
we instead choose f_best_harmonic rather than f_best_global.
The 1σ uncertainty interval for the best-fit frequency is then
found by computing the upper and lower intersections
between χ²_min + inv-χ²-cdf(0.68, 7) and the periodogram in the
vicinity of f_best. Note that we used N = 7 free parameters in
this case, i.e., the number of elements of X_j (Equation 18).
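
The harmonic test above depends on the inverse χ² cumulative distribution function. The sketch below implements it in pure Python (a series expansion of the regularized incomplete gamma function plus bisection) and applies the selection rule; all function names are hypothetical, and in practice a library routine such as a statistics package's χ² quantile function would normally be used instead.

```python
import math

def chi2_cdf(x, k):
    """Chi-squared CDF for k degrees of freedom via the incomplete-gamma series."""
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * k, 0.5 * x
    term = math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))
    total, n = term, 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total

def inv_chi2_cdf(p, k):
    """Inverse chi-squared CDF (the text's inv-chi2-cdf) found by bisection."""
    lo, hi = 0.0, float(k)
    while chi2_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def choose_frequency(f_global, chi2_global, chi2_harmonic, n_par=7, p=0.95):
    """Prefer the half-frequency solution if its chi2 is within the threshold."""
    if chi2_harmonic < chi2_global + inv_chi2_cdf(p, n_par):
        return 0.5 * f_global
    return f_global
```

For N = 7 the 95% threshold evaluates to about 14.07, so the half-frequency (double-peaked) solution is adopted whenever its χ² is no more than about 14 above the global minimum.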

4.2. Nonlinear phase-function parameters 

Modeling the phase function φ with either the G or G_12
model (Equations [11] and [12]) introduces a second
non-linear parameter (after the frequency f) and so we must
modify the equations of the previous section accordingly. We
sample N_pha = 200 evenly-spaced phase-function parameter
values. In particular, for G we test the interval −0.3 ≤ G ≤ 0.7
in steps of ΔG = 0.005, and for G_12 we test the interval
0 ≤ G_12 ≤ 1 in steps of ΔG_12 = 0.005.

Our approach is to modify the left-hand side of Equation
(16) by defining a new matrix which contains all possible
phase-function-corrected observed magnitudes:

$$
m'_{iq} = m_i - \Phi_{iq} = \sum_{k} L_{ijk}\,X_{jkq}, \qquad i = 1, 2, \ldots, N_{\rm obs} \tag{25}
$$

where, e.g., for the case of the G-model (Equation [11]),

$$
\Phi_{iq} = -2.5 \log_{10}[\phi(\alpha_i, G_q)]
          = -2.5 \log_{10}[(1 - G_q)\,\phi_1(\alpha_i) + G_q\,\phi_2(\alpha_i)] \tag{26}
$$

The linear-parameter-solution array X now has an extra
index q, reflecting the fact that we are now solving for each
linear parameter as a function of the two non-linear parameters.
The design matrix has the same number of indices as before
(but fewer rows):

$$
L_{ij} = \begin{pmatrix}
1 \\
\sin(2\pi f_j \tau_i) \\
\cos(2\pi f_j \tau_i) \\
\sin(4\pi f_j \tau_i) \\
\cos(4\pi f_j \tau_i)
\end{pmatrix}
\tag{27}
$$

while the linear-parameter matrix X is now

$$
X_{jq} = \begin{pmatrix}
H_{jq} \\
(A_{1,1})_{jq} \\
(A_{2,1})_{jq} \\
(A_{1,2})_{jq} \\
(A_{2,2})_{jq}
\end{pmatrix}
\tag{28}
$$


The appeal in adopting the above approach is that the
general solution is only slightly modified:

$$
X_{jkq} = \sum_{\ell,n,p} S_{jk\ell}\,L_{nj\ell}\,(B^{-1})_{np}\,m'_{pq}, \tag{29}
$$

where the only difference between Equations (19) and (29)
is the q indices appended to X and m (the latter being
redefined as m').

The fit’s residuals R are now a function of frequency and
phase-function parameter:

$$
R_{ijq} = m'_{iq} - \sum_{k} L_{ijk}\,X_{jkq}, \tag{30}
$$

as is the fit’s chi-squared:

$$
(\chi^2)_{jq} = \sum_{i,n} R_{ijq}\,(B^{-1})_{in}\,R_{njq}. \tag{31}
$$

Figure 4. Examples of lightcurves having both well-sampled rotation and phase-function components. Each row corresponds to a different asteroid. These
example asteroids are sorted vertically by their physical diameter (assuming 7% albedo); the top object is ~45 km and the bottom object is ~2 km. Column A
shows the phase curve (corrected for rotation); Column B shows the rotation curve (corrected for phase-function); Column C shows the periodogram; Column D
shows the distribution of the observations in rotational phase vs. solar phase angle. Above each plot is additional information depending on the column: (A) the
asteroid number, followed by (in square brackets) the opposition year (most are 2013) and filter (in all cases ‘r’) followed by the fitted G_12 parameter; (B) the
fitted absolute magnitude and amplitude; (C) the fitted period (in hours); (D) the number of data points included (and shown) in the fit.

As a function of any of the linear parameters, the fit’s χ²
varies precisely quadratically, whereas as a function of
frequency it has an intricate spectral structure with many local
minima. As a function of a non-linear phase parameter (G or
G_12), the χ² tends to have a single minimum (on the range
we evaluate): in this sense G and G_12 are more similar to
the linear parameters than they are to frequency. However,
the generally asymmetric shape of the phase parameter’s χ²
dependence necessitates its grid-based numerical treatment,
particularly to ensure accurate estimation of the phase
parameter’s uncertainty.

The two-dimensional χ² surface given by Equation (31),
which is defined on an N_frq × N_pha grid, can be reduced to a
one-dimensional χ² function by choosing, for each frequency
index j, the phase-parameter index q that minimizes the χ².
The result is a one-dimensional periodogram, as in Equation
(24). Once the fitted frequency is identified, we compute the
uncertainty in the fitted f by the method described in the
previous section using the inv-χ²-cdf() function. We then
likewise numerically compute the uncertainty in the phase
parameter by again collapsing (χ²)_jq to a one-dimensional vector,
this time as a function of the phase parameter with the
frequency fixed at the fitted value (j-index), and use the
inv-χ²-cdf() function to estimate the uncertainty in the phase
parameter.
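
The two collapsing steps described above amount to a minimum over one axis of the χ² grid followed by a threshold crossing. The Python sketch below uses hypothetical names and a nested list `chi2[j][q]`; the interval is reported at grid resolution, without the interpolation a production implementation might add.

```python
def collapse_to_periodogram(chi2):
    """For each frequency index j, keep the minimum chi2 over the phase index q.

    Returns (periodogram, best_q), where best_q[j] is the minimizing q.
    """
    best_q = [min(range(len(row)), key=row.__getitem__) for row in chi2]
    return [row[q] for row, q in zip(chi2, best_q)], best_q

def uncertainty_interval(grid, chi2_1d, delta):
    """Range of grid values around the minimum where chi2 <= chi2_min + delta.

    `delta` plays the role of inv-chi2-cdf(0.68, N) in the text.
    """
    i_best = min(range(len(chi2_1d)), key=chi2_1d.__getitem__)
    level = chi2_1d[i_best] + delta
    lo = hi = i_best
    while lo > 0 and chi2_1d[lo - 1] <= level:
        lo -= 1
    while hi < len(grid) - 1 and chi2_1d[hi + 1] <= level:
        hi += 1
    return grid[lo], grid[hi]
```

The same `uncertainty_interval` helper serves both cases: applied to the collapsed periodogram it brackets the frequency, and applied to the row `chi2[j_best]` it brackets the phase parameter at the fitted frequency.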

As noted in Table 1, a total of 587,466 lightcurves exist in
PTF, where each lightcurve by definition consists of all
reliable observations of a unique asteroid observed in a single
opposition in a single photometric band. Of these, only ~10%
(59,072 lightcurves) have at least 20 observations and
therefore qualified for fitting with our algorithm. A total of 54,296
lightcurves actually produced a fit; the remaining ~5,000
lightcurves failed to produce a fit either because some
observations were discarded and the total fell below 20 data points,
or because the fitted cosmic error grew to exceed 0.1 mag.

Figure 4 shows several examples of lightcurves fitted with
the algorithm described in this section. In the third column
(column C) of Figure 4, we show the periodograms of each
lightcurve. Note that although the periodograms’ horizontal
axes are labeled with the period (for easier interpretation),
the chi-squared (per degree of freedom) values are actually
plotted linearly with respect to frequency. This is because, as
described earlier, our sampling is uniform with respect to
frequency, and the harmonics are more easily seen with constant
frequency spacing. Column (D) shows the data sampling in
rotational phase versus solar phase angle, a useful plot to
ensure there is no obvious correlation between the two (which
could lead to an erroneous fit, e.g., for long periods, large
amplitudes and/or few data points).

4.3. Comments on implementation 

Each iteration in the fitting of each asteroid lightcurve
involves evaluating the arrays and tensor-products in either
Equation (19) or (29). This includes inverting the
data-covariance matrix B (Equation [20]) and inverting the N_frq
matrices s_j (Equation [22]). The arrays L, m', X and R can
have a relatively large number of elements, making them and
their relevant products potentially taxing with respect to
computational memory.

Our particular implementation of this algorithm leverages
the efficient array-manipulation capabilities of MATLAB,
especially its ability to perform fast matrix multiplication and
matrix inversion utilizing BLAS calls^11 and OpenMP
multi-threaded C loop code^12. Given typical numbers of
observations and frequency sampling, each of our lightcurve fits
(including the multiple iterations) takes on average several tens
of seconds to run on an eight-core machine (multi-threading
enabled), and typically consumes less than ~4 GB of memory
using single-precision computation.

Figure 5. For the 927 lightcurves (805 unique asteroids) having a quality
code 3 period in the Lightcurve Database of Warner et al. 2009 and an
original fit in this work, we plot the distribution of the relative error in our fitted
rotation frequencies with respect to the literature-referenced frequencies. The
distribution is bimodal, with the left-hand mode corresponding to those fits
having better than ~3% agreement.

In the online supplementary material we provide our
custom MATLAB function used for fitting the G-parameter
version of the lightcurve model (asteroid_lc_fit_G.m).
Analogous versions exist for the Shevchenko and G_12
models. This function takes as input an asteroid’s apparent
magnitudes, magnitude uncertainties, observed epochs, phase
angles, geocentric and heliocentric distances. Its outputs
include the linear-parameter-solution array (Equation 28),
residuals (Equation 30), chi-squared array (Equation 31), and
additional information about each lightcurve solution such as the
amplitude and peak ratios.

5. RELIABILITY OF FITTED ROTATION PERIODS 

A primary concern in the quality assessment of our fitted
lightcurve parameters is the validity of our derived rotation
periods. In this section we describe several methods of
estimating the reliability of these periods, beginning with
comparison to a ground-truth subsample of known-period
asteroids and followed by a full vetting of our entire sample using
a combination of machine-learning and manual classification.

The fitted period may differ (slightly or significantly)
between the fits using the different phase function models. In
this section for simplicity we consider only the period value
obtained when fitting with the G_12 phase-function model
(Section 3.2.3). In subsequent sections we will again consider
all three φ models.

5.1. Known-period subsample 

A total of 927 (~2%) of our fitted lightcurves belong to 805
unique asteroids having a previously-measured period listed
in the Lightcurve Database (LCDB) of Warner et al. (2009).
This includes only asteroids having a quality code of 3
(highest quality) in the LCDB.

^11 http://www.netlib.org/blas
^12 http://openmp.org

Figure 6. Examples of lightcurves whose fitted frequency differs from the reference frequency by more than 3%, so that they fall in the right mode in the
histogram shown in Figure 5 and are formally defined as inaccurate fits. Row 1: Low-amplitude rotator. Row 2: Incorrect period (too few observations?). Row
3: A fitted frequency that differs from the reference frequency by 12%. Row 4: Period that differs by a non-integer multiple, despite looking reasonable. Row 5:
Folded lightcurve appears to be fitting noise in the data.

Figure 5 shows that the distribution of relative errors on our
fitted frequencies is bimodal, with the left mode
corresponding to periods having better than ~3% agreement with the
reference period, and the right mode corresponding to periods
in disagreement with the reference period. These disagreeing
fits include lightcurves which differ from the reference value
by a harmonic (half = relative error 0.5, double = relative
error 1.0), as well as frequencies that do not differ by a factor
of two or any integer multiple. About 1/3 of the lightcurves
in Figure 5 fall into the right mode and are thus considered
disagreeing fits.
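
The accuracy criterion used here can be written in a few lines. The sketch below assumes (consistently with the "half = 0.5, double = 1.0" remark, but not stated explicitly in the text) that the relative error is |f_fit − f_ref| / f_ref; the function names are hypothetical.

```python
def relative_frequency_error(p_fit_hr, p_ref_hr):
    """Relative error of the fitted frequency, given periods in hours."""
    f_fit, f_ref = 1.0 / p_fit_hr, 1.0 / p_ref_hr
    return abs(f_fit - f_ref) / f_ref

def is_accurate(p_fit_hr, p_ref_hr, tol=0.03):
    """True if the fitted frequency agrees with the reference to within tol."""
    return relative_frequency_error(p_fit_hr, p_ref_hr) < tol
```

Under this definition a fitted period of 15.7 hr against a reference of 9.7 hr has a relative frequency error of about 0.38 and falls in the right-hand (disagreeing) mode.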

Figure 6 shows some examples of these disagreeing fits.
Row 1 shows an apparent low-amplitude rotator, whose fitted
period of 15.7 hr differs from the reference value of 9.7 hr.
Row 2 is an object whose periodogram contains a great deal
of noise, divided into two broad forests of frequency minima.
The left forest appears to have been selected by our fitting
algorithm while the right forest seems associated with the true
period of ~2.7 hr. Row 3 contains an object whose 12%
relative frequency error exceeds the 3%-accuracy threshold we
have defined, and so despite appearing to be a good fit it is
formally categorized as inaccurate. Row 4 also looks like a
reasonable fit at 6.4 hr, but disagrees with the reference period
of 11.0 hr (though the latter does have a perceptible local
minimum in the periodogram). Finally, Row 5 includes a likely
example of the algorithm fitting noise in the photometry of a
faint asteroid.

In Figures 7 and 8 (top and middle rows) we detail
the distribution of the accurately-recovered-period and
inaccurately-recovered-period subgroups in terms of eight
different lightcurve parameters. Some basic observations
from these histograms are:

1. fitted periods are far less reliable if longer than ~1 day
or shorter than ~2.7 hours,

2. fitted amplitudes of less than 0.1 mag correspond to the
least reliably fit periods,

3. lightcurves consisting of observations dimmer than
~18.5 mag are much less reliable than brighter
lightcurves (though they are also far less numerous in
the known-period sample),

4. fit χ² (per degree of freedom) values of less than ~1.7
correlate with less reliable periods (though they are also
far less numerous in the known-period sample). Note
that, in the fitting process, growth of the cosmic error
term ceased once the χ² (per degree of freedom) fell
below 3 (cf. Figure 3),

5. the number of observations in a lightcurve is not
directly correlated to the reliability of the fitted period,

6. the ratio of the folded lightcurve’s two peaks, the
signal-to-noise ratio of the periodogram’s chosen
minimum, and the uncertainty in the absolute magnitude
parameter are all strong indicators of the reliability of
the fitted period.

Figure 7. Top row: The 927-lightcurve known-period sample (black), divided into the accurately-fitted (green) and inaccurately-fitted (red) subgroups. Middle
row: Ratio of the green to black histograms. Bottom row: Results of cross-validation of the machine-classifier (see Section 5.2.2).

The above comments reflect consideration of the
one-dimensional distributions in Figures 7 and 8; however, we can
easily imagine there are correlations in more dimensions not
evident from these plots alone. An obvious example would be
the two-dimensional distribution in amplitude versus median
magnitude; reliability is presumably greater for bright
asteroids having amplitudes <0.1 mag than it is for dim asteroids
having amplitudes <0.1 mag. Period versus amplitude is also
likely an insightful distribution (and was considered for
example by Masiero et al. 2009). The number of observations
possibly does correlate with reliability if we were to restrict
another parameter or parameters to some specific interval.


Rather than manually examining the period-fitting
reliability as a function of all possible multi-dimensional
combinations of the eight lightcurve parameters detailed in Figures 7
and 8, we can take a more general approach of considering
the reliability to be a single function defined on the
multi-dimensional parameter space in which all of the lightcurves
reside. We hypothesize that accurately-fit lightcurves and
inaccurately-fit lightcurves occupy distinct regions in this
multi-dimensional volume. As these volumes can overlap
to some extent, we can at least estimate the probability that
a lightcurve with a particular vector of parameters
corresponds to an accurately-recovered (or inaccurately-recovered)
period when obtained by the fitting algorithm of Section 4.

There are two general ways of accomplishing this goal. One
way is to produce a large number of synthetic lightcurves
filling out the multidimensional lightcurve-parameter space,
subject these synthetic lightcurves to our fitting algorithm,
and thereby map out (e.g., by binning and interpolation) the
fit reliability throughout the multi-dimensional volume. This
method requires us to accurately simulate all sorts of
varying sampling cadence as well as measurement uncertainties,
including contributions from both systematics and noise, and
it requires significant extra computing time to actually
subject the synthetic data to our fitting procedure. The second
method (the approach we take in this work) uses a
ground-truth sample (the known-period lightcurves already described
in this section) to train a machine classifier to discriminate
reliable versus unreliable fits within the multi-dimensional
lightcurve-parameter space.

Figure 8. Top row. The 927-lightcurve known-period sample (black), divided into the accurately-fitted (green) and inaccurately-fitted (red) subgroups. Middle
row. Ratio of the green to black histograms. Bottom row. Results of cross-validation of the machine classifier (see Section 5.2.2).


5.2. Machine learning 

We adopt a supervised ensemble-method approach for clas¬ 
sification, originally popularized by Breiman et al. (1984), 
specifically the random forest (RF) method (Breiman 2001). 
RF classification has extensive and diverse applications in 
many fields (e.g., economics, bioinformatics, sociology). 
Within astronomy in particular RF classification is one of the 
more widely-employed methods of machine-learning, though 
many alternatives exist. For example, Masci et al. (2014) 
use the RF method for variable-star lightcurve classification, 
while others have approached this problem via the use of, 
e.g., support vector machines (Wozniak et al. 2004), Kohonen 
self-organizing maps (Brett et al. 2004), Bayesian networks 
and mixture-models (Mahabal et al. 2008), principal component
analysis (Deb & Singh 2009), multivariate Bayesian and
Gaussian mixture models (Blomme et al. 2011), and thick-pen 
transform methods (Park et al. 2013). 

For general descriptions of RF training and classification, 
we refer the reader to Breiman (2001), Breiman & Cutler 
(2004), and the many references cited by Masci et al. (2014). 
Our use of a RF classifier is particularly motivated by its 
already-proven application to the discovery and classifica¬ 
tion of astrophysical transients in the same PTF survey data 
(Bloom et al. 2012), as well as streaking near-Earth asteroid 
discovery in PTF data (Waszczak et al. in prep.). 

Machine-learning application generally consists of three
stages: training, cross-validation, and classification. In the
training stage of building a machine classifier, the
multi-dimensional parameter space is hierarchically divided into
subspaces called nodes; these nodes collectively comprise a
decision tree. The smallest node (also known as a leaf)
is simply an individual datapoint (in our case, a single 
lightcurve). Given a set of leaves with class labels, one can 
build an ensemble of trees (called a forest), each tree repre¬ 
senting a unique partitioning of the feature space, wherein 
the nodes are split with respect to different randomly-chosen 
subsets of the parameter list. Each node splitting attempts 
to maximize the separation of classes between the sub-nodes. 
Serving as a model, in the subsequent classification stage the 
forest allows one to assign a probability that a given vector of 
features belongs to a given class. During cross validation (an 
essential early stage in this process), the training and classi¬ 
fication steps are repeated many times, each time using dif¬ 
ferent subsamples (of labeled data) as the training data and 
testing data. Cross validation evaluates the classifier’s perfor¬ 
mance and ensures it is not overfitting the training data. 

For our lightcurves, we are interested in a binary classification,
i.e., whether the fitted period is accurate (‘real’) or
inaccurate (‘bogus’). Bloom et al. (2012) coined the term
realBogus to describe this binary classification probabil¬ 
ity in the context of extragalactic transient identification. In 
the present work we are essentially adapting Bloom et al.’s 
realBogus concept to the problem of lightcurve-period re¬ 
liability assessment. 

We employ a MATLAB-based Random Forest classifier
(https://code.google.com/p/randomforest-matlab),
which is a port of the original RF software (originally
implemented in R). This software includes two main functions,
which perform the training and classification steps separately.
Table 2. Summary of the 20 lightcurve parameters (features) used by our period-quality classifier. See text for a discussion of the cross-validation-derived importance value (Section 5.2.2).

feature      importance (%)   description
peakRatio    11.1             Ratio of the fitted lightcurve’s two peaks (= max − min). Zero if only one peak, one if exactly the same height.
amplitude    10.2             Fitted amplitude of the folded lightcurve. Equivalent to the height (max − min) of the larger of the two peaks.
periodFit    8.6              Rotation period value obtained using this work’s data and fitting algorithm.
freqSNR      8.4              Signal-to-noise of the fitted (minimum) frequency in periodogram = 2 × |min − median| / (84th-percentile − 16th-percentile)
hMagErr      5.8              Uncertainty in the fitted H-magnitude (i.e., error in the fitted absolute magnitude)
a12Coeff     4.3              Fourier coefficient A_12
a22Coeff     4.2              Fourier coefficient A_22
numObsFit    4.1              Number of observations in the final fitted lightcurve, after discarding any bad observations
medMag       4.1              Median calibrated magnitude (in the photometric band specific to the lightcurve, either R or g)
chisq        4.1              Reduced chi-squared of the fit (i.e., per degree of freedom)
a21Coeff     4.0              Fourier coefficient A_21
a11Coeff     3.8              Fourier coefficient A_11
rmsFit       3.8              Root-mean-squared residual of the fit
hMagRef      3.8              Reference H-magnitude (i.e., absolute magnitude of the asteroid in V-band as listed by the MPC)
kindex       3.7              Stetson’s K-index (a measure of kurtosis in the magnitude distribution of a folded lightcurve, introduced by Stetson 1996)
freqResol    3.7              Resolution of the periodogram: $\Delta f = 1/(4\Delta t)$, where $\Delta t$ is the time between the first and last observations in the lightcurve
hMagResid    3.7              Difference between the reference absolute magnitude (hMagRef) and the fitted H-magnitude
cuspindex    3.6              ‘Cusp index’: Median squared residual of the dimmest 10% points divided by the median squared residual of all other points
numObsRem    2.9              Number of observations removed during the fitting process (due to >7-sigma residuals with respect to preliminary fits)
cosmicErr    2.1              Final ‘cosmic error’ value at end of fitting process (<0.1 mag in all cases)

Figure 9. Correlation matrices (Spearman’s ρ coefficient) for the 20 lightcurve features (Table 2) in the training sample (left) and in the full data set (right).


5.2.1. Classifier training 

Our training data consist of the known-period lightcurves 
(cf. the previous section) belonging to the two classes under 
consideration: 618 lightcurves having accurately-fit rotation 
periods and 309 lightcurves having inaccurately-fit periods. 
Membership in one class versus the other depends on our ar¬ 
guably arbitrary 3% relative accuracy threshold, though we 
claim the clearly bimodal shape of the distribution in Figure 
5 justifies this 3% criterion. We note also that the classifier 


ultimately only provides a probability that a given lightcurve 
belongs to one class or the other, so that objects very near to 
the 3% cutoff may conceivably correspond to classification 
probabilities close to 0.5. 
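
As a minimal sketch of how the two class labels could be assigned, the Python function below applies a relative-accuracy cut to a fitted period. The function name, its arguments, and the choice to measure the error relative to the reference period are illustrative assumptions; only the 3% value is taken from the text.

```python
def is_accurate_fit(fitted_period_h, reference_period_h, tolerance=0.03):
    """Label a fit 'accurate' if it agrees with the reference period.

    Assumption: the relative error is measured with respect to the
    reference period; `tolerance` defaults to the 3% threshold.
    """
    relative_error = abs(fitted_period_h - reference_period_h) / reference_period_h
    return relative_error <= tolerance


# Hypothetical periods (hours): a 2% mismatch passes, a 7% mismatch fails.
print(is_accurate_fit(5.1, 5.0))
print(is_accurate_fit(392.0 * 1.07, 392.0))
```

Under this definition a fit lying just inside or just outside the cut receives opposite labels, which is why classification probabilities near 0.5 are plausible for such objects.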

An important point is that the ‘ground-truth’ reference periods
we have taken from the database of Warner et al. (2009)
may include some number of inaccurate periods. Such periods
may be the product of erroneous fitting on the part of
any one of its many different contributors, each of whom may
employ a different fitting procedure and/or adhere to different
confidence criteria. For the sake of this work, however, we
consider all quality code 3 periods to be accurate; any actual
inaccuracy will contribute to decreased classifier performance.

Figure 10. Examples of reliable lightcurves whose folded rotation curves include cusp-like minima (systematic negative deviation from the 2nd-order Fourier fit
at minimum brightness), suggestive of a binary system. Many more examples exist in our lightcurves; however, in this work we have not specifically flagged such
lightcurves. Future works will more carefully label and analyze this particular class of objects.

Besides ground-truth periods that are simply inaccurate, we
also in principle risk contamination from reference periods
that are no longer accurate. We assume that the majority of
asteroids’ periods are not changing with time, at least not at
levels measurable with our data. For instance, direct
measurement of the YORP mechanism in at least one asteroid
(Lowry et al. 2007) reveals a relative rotation period change
of several parts per million over several years. Any
measurable period changes would likely be due to recent
collisional events. The case of asteroid 596 Scheila (Bodewits
et al. 2011) demonstrates that detectable collisional events
among main-belt asteroids do occur on a relatively regular
basis, though even this robustly-detected collision imparted no
measurable change in the asteroid’s spin rate (Shevchenko et
al. 2013).

Although Figures 7 and 8 detail the period-fitting reliability
as a function of only eight lightcurve parameters, we
construct our classifier using 12 additional parameters, for a total
of twenty lightcurve parameters. In the context of machine
learning these parameters are known as features. The twenty
features we use were chosen on the basis of their availabil¬ 
ity (most are output directly by the fitting process and do not 
require additional computation) as well as their actual impor¬ 
tance (as computed during the cross-validation tests described 


in the next section). 

Our twenty lightcurve features are listed in Table 2, in
order of decreasing importance. Most of these quantities
we have discussed already in previous sections in the
context of our model and fitting procedure. The list also
includes two features characterizing the magnitude distribution
of the folded lightcurve: (1) Stetson’s K-index, a measure
of the kurtosis borrowed from variable star lightcurve
analysis (Stetson 1996), and (2) a ‘cusp index’ which quantifies
the extent to which the dimmest 10% of the data points in
the folded lightcurve deviate from the best fit relative to the
other 90% of the data points. We designed the cusp index
to potentially identify eclipsing systems which are poorly fit
by the two-term Fourier approximation but nonetheless may
have accurately-fit periods (examples of lightcurves with such
cusp-like minima appear in Figure 10). Eclipsing binaries
would be most properly treated with a different model
entirely, as would tumbling asteroids (which we also did not
systematically try to identify in the data, and which probably lack
reliable lightcurve solutions when subjected to this work’s
algorithm).
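
The cusp index defined in Table 2 (median squared residual of the dimmest 10% of points divided by the median squared residual of all other points) can be sketched in a few lines of Python. This is an illustrative sketch only: the function and argument names are hypothetical, and we assume that "dimmest" is judged from the observed magnitudes of the folded lightcurve (larger magnitude meaning dimmer).

```python
from statistics import median


def cusp_index(magnitudes, residuals, dim_fraction=0.10):
    """Median squared residual of the dimmest points over that of the rest.

    magnitudes : folded-lightcurve magnitudes (larger = dimmer; assumed)
    residuals  : observed-minus-fitted magnitudes, same ordering
    """
    if len(magnitudes) != len(residuals) or len(magnitudes) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    # Indices sorted from dimmest to brightest.
    order = sorted(range(len(magnitudes)), key=lambda i: magnitudes[i], reverse=True)
    n_dim = max(1, round(dim_fraction * len(order)))
    dim_sq = [residuals[i] ** 2 for i in order[:n_dim]]
    rest_sq = [residuals[i] ** 2 for i in order[n_dim:]]
    return median(dim_sq) / median(rest_sq)


# Hypothetical lightcurve: the two dimmest points are fit three times worse.
mags = [15.0 + 0.01 * i for i in range(20)]
resid = [0.1] * 18 + [0.3, 0.3]
print(cusp_index(mags, resid))
```

A value well above unity indicates that the model misses the minima specifically, as expected for eclipse-like cusps.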

Figure 9 visualizes the two-dimensional correlation
coefficients for all possible pairs of the 20 lightcurve features.
Overall, the correlation structure of the training sample
qualitatively resembles that of the full data set, implying the
training set fairly well represents the overall data set in terms of its
feature-space structure. On the other hand, the distributions
(e.g., median value, range of values) of individual features in
the training set do not necessarily match the distributions in
the full data set: this is evident for the several features plotted
in Figure 14. An obvious example is that the full data set
contains far more faint asteroids than does the training sample,
even though in both cases the median magnitude (medMag)
is positively correlated with quantities like rmsFit (due to
Poisson noise) and hMagRef (since larger asteroids tend to
be brighter).

Figure 11. Definitions of true vs. false and positive vs. negative labels.
True-positive rate (TPR) is sometimes called the completeness or sensitivity,
while false-positive rate (FPR) is otherwise known as the false-alarm rate,
one minus the reliability, or one minus the specificity.

Figure 12. True-positive versus false-positive rates for the cross-validation
trials. Such a plot is sometimes referred to as a receiver operating
characteristic (ROC) curve. Each trial trains the classifier using a
randomly-chosen 80% of the known accurate fits and 80% of the known inaccurate fits
among the 927 lightcurves that have reference periods. The remaining 20% of
lightcurves serve as the test sample. Moving along the hyperbolic locus of
points in this plot is equivalent to tuning the classification probability
threshold from zero (lower left of the plot) to one (upper right of the plot). The
errorbars represent the scatter in the 1,000 cross-validation trials.

5.2.2. Classifier cross-validation 

Figure 13. Varying the number of features that are randomly split per node in
the decision-tree-building process affects both the TPR and FPR. The values
plotted here correspond to the p > 0.5 classification threshold; each point
was generated by the exact same process for which the results in Figure 12
were generated, only varying the number of features with respect to which
nodes are split. In the left plot, the first four points are labeled with the
number of features for that trial (for n > 4 we omit the label). In our actual
implemented model (Figure 12) we chose n = 4 features, the value after which
the TPR/FPR ratio plateaus at approximately 2, and also the value Breiman
(2001) recommends, i.e., the square root of the total number of features (in
our case, $\sqrt{20} \approx 4$).

To ascertain the trained classifier’s capabilities, and to
ensure that the classifier is not overfitting the training data, we
perform a series of 1,000 cross-validation trials. In each trial
we split each class (accurate fits and inaccurate fits) into a
training subsample (a randomly chosen 80% of the class)
and a test subsample (the remaining 20% of the class). We
then train a classifier using the combined training subsamples
and subsequently employ the classifier on the combined test
subsamples. In each of the trials, the classifier outputs a
classification probability (score) for each object in the test
sample, and we track the true-positive rate (TPR; fraction of
accurate period fits that are correctly classified above some
threshold probability) as a function of the false-positive rate (FPR;
fraction of inaccurate period fits that are incorrectly classified
above said threshold probability). See Figure 11 for a
summary of these terms.
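
The per-class 80%/20% partition and the TPR/FPR bookkeeping described in this section can be sketched as follows. This is a simplified illustration in plain Python, not the MATLAB code used in this work; the function names are hypothetical, labels are assumed to be booleans (True for an accurate fit), and the classifier itself is left out.

```python
import random


def stratified_split(labels, train_fraction=0.8, rng=None):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = rng or random.Random()
    train, test = [], []
    for cls in (True, False):
        idx = [i for i, y in enumerate(labels) if bool(y) == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return train, test


def tpr_fpr(scores, labels, threshold=0.5):
    """True- and false-positive rates for scores strictly above `threshold`."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    tp = sum(1 for s, y in zip(scores, labels) if y and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s > threshold)
    return tp / n_pos, fp / n_neg


# Class sizes from Section 5.2.1: 618 accurate and 309 inaccurate fits.
labels = [True] * 618 + [False] * 309
train, test = stratified_split(labels, rng=random.Random(0))
print(len(train), len(test))
```

Repeating the split with a fresh random state, retraining, and evaluating `tpr_fpr` over a grid of thresholds would yield one ROC-like locus per trial; averaging over trials gives point centers and scatter of the kind shown in Figure 12.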

The results of the cross-validation are shown in Figure 
12. By tuning the minimum classification probability used to 
threshold the classifier’s output, one effectively moves along 
the hyperbola-shaped locus of points in TPR-vs.-FPR space 
seen in the plot. Several points have labels (p = ...) indicating
the corresponding threshold probability (adjacent points
being separated by $\Delta p = 0.05$). The errorbars in Figure 12
represent the standard deviation of the location of each point 
over all 1,000 trials, while the point centers are the average 
locations. 

A classification threshold of p > 0.5 is conventionally
used when quoting single false-positive and true-positive
rates. In our case, this gives FPR = 0.45 ± 0.07 with TPR
= 0.89 ± 0.03. The contamination of positively-classified
lightcurves in the cross-validation trials depends also on the
actual class ratios in the sample being classified. In particular,
since ~1/3 of our known-period lightcurves are inaccurate
fits (Figure 5), it follows that among all lightcurves the
classifier labels as accurate fits, the contaminated fraction is
$(0.45 \times 1/3)/(0.89 \times 2/3 + 0.45 \times 1/3) \approx 1/5$. If instead of
using the classifier we just randomly labeled some fraction of
the lightcurves as accurate and the rest as inaccurate, the
resulting contamination would be 1/3 (i.e., worse than the 1/5
afforded by the classifier, as expected).
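
The contamination arithmetic above generalizes to any TPR, FPR, and class ratio. A short Python sketch (the function name is hypothetical) makes the dependence explicit and reproduces both the ≈1/5 classifier value and the 1/3 random-labeling value:

```python
def contamination(tpr, fpr, bogus_fraction):
    """Fraction of positively-classified objects that are actually bogus."""
    false_pos = fpr * bogus_fraction          # bogus fits labeled accurate
    true_pos = tpr * (1.0 - bogus_fraction)   # accurate fits labeled accurate
    return false_pos / (false_pos + true_pos)


# Values quoted in the text: TPR = 0.89, FPR = 0.45, ~1/3 inaccurate fits.
print(round(contamination(0.89, 0.45, 1.0 / 3.0), 3))
# Random labeling has TPR = FPR, so the contamination equals the class ratio.
print(round(contamination(0.5, 0.5, 1.0 / 3.0), 3))
```

The same function shows why the contamination in the unlabeled sample may differ from 1/5: if the true fraction of inaccurate fits there exceeds 1/3, the contamination rises even at fixed TPR and FPR.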

Several parameters can be adjusted or tuned when training a
random forest classifier. First is the number of decision trees
generated during the training stage. Classification accuracy
typically increases with the number of trees and eventually
plateaus. Most applications employ hundreds to thousands of
trees; we here use 1,000 trees. Another tunable parameter is
the number of randomly-selected features (out of the 20 total
here considered) with respect to which nodes are split in
building the decision trees. Breiman (2001) recommends using
the square root of the number of features. We ran the
cross-validation for all possible numbers of features with
respect to which the nodes can be split (i.e., all numbers
between 1 and 20). The results are in Figure 13. We chose
n = 4 as the number of features to split, both because the
classifier’s performance plateaus after that value and because
it follows the recommendation of Breiman (2001)
($4 \approx \sqrt{20}$).

Footnote (to the randomly chosen 80% training subsample above): Another
standard, slightly different approach is to evenly split the training data
into k disjoint sets (a process called k-folding). Also, our choice to
separately partition the two classes into training and test subsamples could
be omitted.

Other parameters that can be tweaked are the maximum 
depth of a tree, the minimum number of samples per leaf, the 
minimum number of samples used in a split, and the maxi¬ 
mum number of leaf nodes. We do not constrain any of these 
parameters, meaning we allow: trees of any depth, with any 
number of leaf nodes, leaf nodes consisting of a single sample, 
and splits based on the minimum of 2 samples. We note that 
as a result our model optimization is not comprehensive and 
it is possible a better classifier could be trained. However, the 
relatively small training sample size here is likely the limiting 
factor; additional data is necessary to substantially improve 
the classifier performance. 

In the bottom rows of Figures 7 and 8, we detail the
dependence of the TPR and FPR on various lightcurve parameters.
Averaging (marginalizing) over any of the x-axis quantities in
these bottom-row plots (while also weighting each bin by the
number of lightcurves it contains, cf. the top row of plots in
Figures 7 and 8) produces precisely the TPR and FPR values
of the p = 0.5 data point in Figure 12.

In addition to the TPR and FPR estimates, cross-validation 
allows us to quantify the relative importance of the features by 
computing the average depth in the trees at which a split was 
performed with respect to each feature. Those features with 
respect to which the training sample is consistently divided 
early in the building of each tree are deemed more important 
(i.e., more discriminating) than those features which are split
later, as the tree-building process tries to maximize the sepa¬ 
ration of the classes as early as possible by splitting features 
in an optimal sequence. Both Table 2 and Figure 9 list the 
features in order of importance. 

Note that we had manually guessed several of the most 
important features—namely, peakRatio, freqSNR and 
hMagErr —prior to any machine-learning work via inspec¬ 
tion of the plots in Figure 8. The numerical importance values 
thus agree with these initial observations, and also quantify 
the significance of features which would be difficult to ascer¬ 
tain manually. For instance, numObsFit appears (in Figure 
8) not to be related to the fitting accuracy while medMag (Fig¬ 
ure 7) does appear related to accuracy (fainter lightcurves be¬ 
ing less accurate), yet these two features evidently have equal 
importance in the classification process (cf. Table 2). Fig¬ 
ure 9 indicates that numObsFit and medMag have quite 
different correlation relationships with respect to more im¬ 
portant features. Hence, it would not be surprising if their 
one-dimensional distributions (in Figures 7 and 8) bear no 
resemblance to the multi-dimensional distributions on which 
the decision trees are defined and in which these two parame¬ 
ters apparently carry comparable weight. 


5.2.3. Machine-vetted lightcurves 

Having trained the machine classifier as described in
Section 5.2.1, we use it to predict the validity of our remaining
~53,000 fitted periods (of ~48,000 unique asteroids) which
lack quality code 3 reference periods in Warner et al. (2009).
The automated classifier assigned positive reliability scores
(p > 0.5) to 19,112 of the lightcurves (35% of the total data
set). Figure 14 details the distribution of the lightcurves
(raw-fitted, machine-vetted, and other subsets) with respect to some
of the most important lightcurve features.

With respect to rotation period (Figure 14 panel A), the 
classifier rejects the largest fractions of lightcurves in the 
long-period (>1 day) and short-period (<2.7 hours) bins. 
From Figure 7 (bottom row, leftmost column), we know that 
the classifier’s completeness does not drop significantly for 
these long- and short-period objects, nor is the false positive 
rate higher among them. Hence we have reason to trust the 
classifier’s heavy rejection of periods in these bins, and there¬ 
fore conclude that our fitting algorithm (Section 4) is prone 
to erroneously fitting periods in these period extremes (as was 
also suggested in the known period sample in Figure 7). 

Panel C shows that the mode of the apparent-magnitude 
(medMag) distribution for machine-approved lightcurves is 
~19 mag, as compared to the predominantly V < 17 mag 
known-period training sample. Comparing this to Figure 
2 panel A shows that the limiting magnitude of reliable 
lightcurves is comparable to that of individual detections. 

Panel E of Figure 14 shows that the raw output of our fit¬ 
ting process contains peak-ratio values that are uniformly- 
distributed above 0.2, this particular value being a hard-coded 
threshold that double-peaked lightcurves (at least those with 
amplitudes >0.1 mag) output by our fitting algorithm must 
satisfy (see Figure 3 and Section 4.1). The classifier’s out¬ 
put clearly indicates that reliability is linearly related to the 
peak ratio, as was also prominently seen in Figure 8. Because 
Figure 8 also indicates that the classifier’s true-positive and 
false-positive rates also relate linearly with peakRatio, we 
conclude that the slope of the peakRatio distribution for 
the machine-vetted lightcurves is likely an upper limit for the 
true slope. 

5.3. Manual screening 

In addition to machine-based vetting, we manually
inspected all 54,296 of the lightcurves that were output by our
fitting process. A human screener first studies the
ground-truth known-period examples (Section 5.1) in an effort to learn
to distinguish between accurate and inaccurate fits. Only the
$G_{12}$ fit is considered (as was the case with the automated
classifier), and for each lightcurve the screener inspects precisely
the amount of information included for example in Figures 4,
6 and 10 of this paper. Specifically, for each lightcurve the
screener views a row of four plots: (1) the rotation-corrected
phase curve, (2) the phase-function-corrected folded rotation
curve, (3) the periodogram, i.e., the reduced χ² plotted
linearly against frequency (labeled however with the corresponding
period), and (4) the rotational-phase vs. phase-angle
plot. A single screener is presented with these plots through
a plain-formatted webpage, allowing for efficient scrolling
through the lightcurves and rapid recording of either a
‘reliable’ or ‘unreliable’ rating for each fitted period. In addition,
all lightcurves in the known-period sample were reinserted
into the screening list, with their reference periods removed.
These were thus blindly assessed by the screener, independent
of their formal (3%-accuracy) classification status.

Figure 14. Distributions of PTF-fitted lightcurves (and various subsets thereof) in select features/parameters. These plots are histograms with the same binning
as the top rows of Figures 7 and 8. For better readability we here use line-connected bin points (rather than the stair-plot format used in, e.g., Figure 5).

Figure 15. Example lightcurves for which the machine-based and human-based reliability scores differ. Row 1: Human approved, machine rejected (p = 0.32).
Row 2: Human rejected, machine approved (p = 0.66). Row 3: Human approved, machine rejected. For this object, the fitted period differs from the known
reference period of 392 hours by 7%, hence the machine rejects it by definition. Row 4: Human rejected, machine approved (p = 0.70).

Figure 16. For the 654 unique asteroids having more than one reliable
lightcurve fit (either multiple oppositions and/or both R and g band data)
we plot the log of the relative frequency error, defined as the range of the
asteroid’s fitted periods divided by the geometric mean of its fitted periods.
Comparison with Figure 5 suggests that we can deem all cases with error
<3% as consistently recovered periods, and those with greater than 3% error
as inconsistent fits.

The black lines in Figure 14 plot the results of the man¬ 
ual screening, in which a total of 10,059 lightcurves (19% 
of the total set) were deemed ‘reliable’. With respect to the 
machine-approved sample, the human-rated sample is in all
cases roughly a factor of 1 to 2 smaller in each bin
of the features examined in Figure 14. In general the


shapes of the machine-approved and human-approved distri¬ 
butions match fairly closely. Figure 15 shows examples of 
lightcurves for which the machine- and human-based classi¬ 
fiers differed in their rating (we focus on very short and very 
long fitted periods in Figure 15, but many examples exist for 
intermediate periods as well). 

5.4. Asteroids with multiple fitted periods 

A total of 654 unique asteroids have more than one PTF 
lightcurve whose fitted period was labeled as reliable by the 
vetting process described in the previous sections. These 654 
asteroids collectively have 1,413 fits (so that the average
multiplicity is ~2.2 fits per asteroid) and include objects either
observed in multiple oppositions and/or in both filters during 
one or more oppositions. Figure 16 plots the distribution of 
the relative error in the fitted frequencies of all such multiply- 
fit asteroids, this error being defined as the range of the aster¬ 
oid’s fitted frequencies divided by the geometric mean of its 
fitted frequencies. Just as in Figure 5 (when we compared to 
literature-referenced frequencies), we see a prominent mode 
in the histogram peaking at ~0.1% relative error, with some
excess for errors greater than ~3%. There are 63 asteroids
in particular with relative errors greater than 3%; of
these only four asteroids have more than two fits. If we
assume that, in the remaining 59 pairs of disagreeing periods,
one of the periods is correct, then the contamination fraction
of lightcurves based on the sample of multiply-fit asteroids is
~59/1413 ≈ 4%.
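
The consistency metric used for multiply-fit asteroids (range of the fitted frequencies divided by their geometric mean) can be sketched as below. The function name and the example periods are hypothetical; the 3% cut is the threshold adopted in the text.

```python
import math


def relative_frequency_error(periods_h):
    """Range of fitted frequencies divided by their geometric mean."""
    freqs = [1.0 / p for p in periods_h]
    geo_mean = math.exp(sum(math.log(f) for f in freqs) / len(freqs))
    return (max(freqs) - min(freqs)) / geo_mean


# Hypothetical asteroid with two nearly identical fits, and one whose second
# fit landed on twice the period (a common aliasing failure).
consistent = relative_frequency_error([6.000, 6.006])
aliased = relative_frequency_error([5.0, 10.0])
print(consistent < 0.03, aliased < 0.03)
```

A factor-of-two period alias gives a relative error of about 0.71 under this definition, far outside the 3% cut, so such pairs are cleanly separated from the consistent mode.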
Figure 17. Panel A: Distribution of spin rate and amplitude as functions of infrared-derived diameters (see appendix for diameter data sources), including data
for 4,040 of our lightcurves. The two-dimensional histograms (left side plots) are column-normalized (see text for details). Panel B: Comparison of the period
versus amplitude distribution (regular 2D histogram, not column normalized) with max-spin-rate versus amplitude for a uniform density ellipsoid held together
solely by self-gravity.


6. PRELIMINARY LIGHTCURVE-BASED DEMOGRAPHICS 

In this section we perform a preliminary analysis and in¬ 
terpretation of the demographic trends evident in this work’s 
fitted lightcurve parameters. Forthcoming works and papers 
will more closely examine the population distributions of both 
rotation and phase-function parameters. 

Throughout this section we repeatedly examine variation of 
lightcurve-derived parameters as functions of color index and 
infrared-derived diameters. In the appendix we describe the 
aggregation and characteristics of these two custom data sets 
(compiled from external sources). The color index quantifies 
an asteroid’s probability of membership in the C-type (p = 0)
versus S-type (p = 1) color-based clusters. Objects which in 
fact belong to neither C nor S groups (e.g., V types, D types) 
will have color indices near p = 0.5 provided they are in fact 
separated from both the C-type and S-type clusters in the 2D 
color spaces considered (see appendix). 

There are many interesting demographic questions address¬ 
able with these lightcurve data which—in the interest of 
space—we do not treat in this work. For example, one could 
examine relationships between lightcurve parameters and or¬ 


bital elements and/or family membership, proximity to reso¬ 
nances, and so on. We are making all of these lightcurve data 
available electronically (Tables 4 and 5, cf. Section 9.3) so 
that the community may use these data to help explore such 
science questions. 

6.1. Disclaimer regarding de-biasing 

The preliminary demographic analyses that follow do not 
take into account fully de-biased distributions of, e.g., spin 
rates, amplitudes, or phase-function parameters. The
true-positive and false-positive rates given in the bottom row of
plots in Figures 7 and 8 (also, the blue and violet lines in
Figure 14) constitute some of the necessary ingredients for
producing a fully de-biased data set; however, in this work we do
not attempt to compute the de-biased distributions.

6.2. Rotation rates and amplitudes 

In Figure 17 we reproduce several of the plots appearing in 
Pravec et al. (2002) and references therein, using this work’s 
much larger data set (characterized by at least an order of 
magnitude larger sample of small objects). Both spin rate and 
amplitude are examined for the 4,040 objects having diameter
data from infrared surveys. Unlike Pravec et al. (2002),
we are not able to individually plot each lightcurve’s data (the
~4,000 points would make the plot difficult to render, as well
as difficult to read); hence we plot these (and other relation¬ 
ships later in this section) using two-dimensional histograms 
where the intensity of each pixel corresponds to the number 
of objects in that bin (darker means more, with linear scal¬ 
ing). Additionally, 2D histograms for which the diameter is 
plotted on the horizontal axis have their pixel values column- 
normalized, i.e., all pixels in each column of the histogram 
sum to the same value. This facilitates the visual interpreta¬ 
tion of period and amplitude variation with diameter, as the 
left-hand side (small-diameter end) of the plots would other¬ 
wise saturate the plot. 
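
Column normalization as described here amounts to dividing every pixel by its column total. A minimal Python sketch follows, assuming (as an illustrative convention, not one stated in the text) that the histogram is stored as a list of rows, with columns corresponding to diameter bins:

```python
def column_normalize(hist):
    """Scale each column of a 2D histogram (list of rows) to sum to one.

    Empty columns are left as zeros rather than dividing by zero.
    """
    n_rows, n_cols = len(hist), len(hist[0])
    col_sums = [sum(hist[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [hist[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical counts: the first (small-diameter) column holds 25 times more
# objects than the second, yet both are comparable after normalization.
counts = [[90, 1],
          [10, 3]]
print(column_normalize(counts))
```

After normalization each column shows the conditional distribution of spin rate (or amplitude) at fixed diameter, so sparsely populated large-diameter columns remain visible.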

Following Pravec et al. (2002), we include the geometric mean rotation frequency as computed
from a running bin centered on each object. The half-width of the bin centered on each object
is either 250 (data points) or the object’s distance from the top or bottom of the sorted
diameter list, whichever is smallest. This ensures the geometric mean is not contaminated at
the edges of the plot by the interior values, though it also means more noise exists in these
edge statistics. The geometric mean is a more intuitive statistic for the rotation period than
the arithmetic mean, since the rotation periods tend to span several orders of magnitude. In
addition to the geometric mean, we plot the 16th and 84th percentile values from each running
bin.
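A minimal sketch of the running-bin geometric mean follows, assuming the bin is symmetric in rank (number of data points) about each object and shrinks near either end of the sorted list. The function and argument names are illustrative, not taken from the original analysis.

```python
import math

def running_geometric_mean(diameters, frequencies, max_half_width=250):
    """Geometric-mean frequency in a running bin centered on each object.

    Objects are sorted by diameter. The half-width (in data points) is
    the smallest of max_half_width and the object's distance from either
    end of the sorted list, so the bin always stays symmetric.
    """
    order = sorted(range(len(diameters)), key=lambda i: diameters[i])
    sorted_d = [diameters[i] for i in order]
    log_f = [math.log(frequencies[i]) for i in order]
    n = len(order)
    means = []
    for k in range(n):
        half = min(max_half_width, k, n - 1 - k)
        window = log_f[k - half : k + half + 1]
        # Averaging the logarithms and exponentiating gives the geometric mean.
        means.append(math.exp(sum(window) / len(window)))
    return sorted_d, means
```

The 16th and 84th percentiles could be taken from the same `window` at each step; that part is omitted here for brevity.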

The basic observed trend regarding rotation rate is that smaller-diameter asteroids rotate
faster on average. A slight increase in the rotation rate also appears for objects larger than
~80 km. Binning the data into a coarser set of three diameter bins and normalizing each
object’s spin rate by the local geometric-mean rate, we see a progression from a
near-Maxwellian distribution to a progressively non-Maxwellian distribution for smaller
objects. The rotation rates of a collisionally-equilibrated population of rotating particles
are known to approach a Maxwellian distribution (e.g., Salo 1987), which for a population of
N objects as a function of rotation frequency f is:

$$
n(N, f, f_{\rm peak}) = \frac{4N}{\sqrt{\pi}} \, \frac{f^2}{f_{\rm peak}^3} \,
\exp\left( -\frac{f^2}{f_{\rm peak}^2} \right), \qquad (32)
$$

where n(N, f, f_peak) df is the number of objects in the interval (f, f + df) and f_peak is the
peak frequency (i.e., the frequency corresponding to the distribution’s maximum).

One way of testing how well a Maxwellian actually fits the 
data is the two-sided Kolmogorov-Smirnov (KS) test (Massey 
1951). This test compares an empirical distribution to a refer¬ 
ence distribution (e.g., Gaussian, Maxwellian, or another em¬ 
pirical sample) via a bootstrap method. In particular it com¬ 
putes a statistic quantifying the extent to which the cumula¬ 
tive distribution function differs in the two distributions being 
compared. In our case, we use Equation (32) to simulate a 
large sample (10^) randomly drawn from an ideal Maxwellian 
distribution and compare this simulated sample against the 
99-asteroid sample (of D > 40 km) frequencies. Interestingly, this test indicates our 99
large-asteroid normalized frequencies differ from a Maxwellian at nearly the 10σ confidence
level, though this could be due in part to the lack of a proper de-biasing of the distribution
(cf. Section 6.1).
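The comparison can be sketched in pure Python as below. This only illustrates a two-sample KS statistic against a simulated Maxwellian; it is not the authors' code. It assumes that a Maxwellian with mode f_peak can be drawn as the length of a 3D Gaussian vector with per-component width f_peak/√2, and all function names are hypothetical.

```python
import math
import random

def maxwellian_sample(n, f_peak, rng):
    """Draw n values from a Maxwellian whose mode (peak) is f_peak."""
    sigma = f_peak / math.sqrt(2.0)  # per-component Gaussian width
    return [
        math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(3)))
        for _ in range(n)
    ]

def ks_two_sample_statistic(a, b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max
```

Converting the statistic into a confidence level requires the KS distribution for the two sample sizes. A library routine such as SciPy's `scipy.stats.ks_2samp` returns both the statistic and a p-value, and would be the more practical choice where SciPy is available.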

All of these trends, including the qualitative resemblance to a Maxwellian but formal
disagreement with it, were noted previously by Pravec et al. (2002). At the time their
D < 10 km size bin contained data on only 231 objects, as opposed to our sample of 2,844
asteroids with D < 10 km. Conversely, our D > 40 km bin contains only 99 objects as compared
to the ~400 large asteroids they took into consideration in comparing to a Maxwellian.

Steinberg & Sari (2015) recently described how collisional evolution of large asteroids should
actually lead to a Lévy distribution, which has a significantly longer tail than a Maxwellian
distribution having the same peak. They compared their theory to spin rates of D > 10 km
asteroids from the LCDB and found in general that the Lévy distribution fails to fit the spin
distribution of large asteroids, suggesting that there may be a significant primordial
component to the spin distribution. Potential primordial contributions to the angular
momentum of asteroids were explored by Harris & Burns
(1979) and later authors; we will return to this topic in Section 

Our amplitude distribution contains an obvious observational bias (cf. Section 6.1) in that
amplitudes less than ~0.1-0.2 mag are generally ill-fit by our modeling procedure (cf.
Figure 7) and thus significantly underrepresented in our sample of reliable lightcurves
considered here. Nonetheless, we see a clear trend of smaller asteroids exhibiting larger
rotational amplitudes, consistent with the idea that larger bodies have sufficient surface
gravity to redistribute any loose mass to a more spherical shape.

As we have done for the normalized frequency distribution, we plot diameter-binned normalized
amplitudes against a Maxwellian distribution, this time merely to guide the eye as opposed to
validating any hypothetical physical interpretation. The fact that the normalized amplitude
distributions do not deviate too drastically from the Maxwellian shape at smaller diameters
indicates that the spread in the amplitude distribution is proportional to its mean value, a
basic property of the Maxwellian distribution, hence the good agreement. Carbognani (2010)
provides a recent analysis of asteroid rotation amplitudes, and highlights a similar increase
in both the amplitude’s mean and spread with decreasing diameter.

Panel B of Figure 17 shows the distribution in period-vs.-amplitude space, in which we can
plot all 9,033 lightcurves, including those lacking a diameter estimate. Contours representing
the maximal spin rate of a body held together solely by self-gravity, for certain uniform
densities, are overplotted. Our data as a whole do not appear to populate the region beyond the
~2 g/cm^3 contour. Later in this section we will re-examine this behavior separately for the
two major taxonomic classes.
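For orientation, one widely used approximation for such contours is the critical period of a cohesionless body, P_crit ≈ 3.3 h × sqrt((1 + Δm)/ρ), with ρ in g/cm^3 and Δm the lightcurve amplitude in magnitudes (Pravec & Harris 2000). The sketch below evaluates it. Whether the contours in Figure 17 use exactly this form is an assumption here, and the function name is hypothetical.

```python
import math

G_SI = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def critical_period_hours(density_g_cm3, amplitude_mag=0.0):
    """Approximate shortest spin period (hours) for a strengthless body.

    For a sphere, equating gravity and centrifugal acceleration at the
    equator gives P = sqrt(3*pi / (G*rho)). The sqrt(1 + amplitude)
    factor is the elongation correction of the assumed approximation.
    """
    rho = density_g_cm3 * 1000.0  # convert g/cm^3 to kg/m^3
    sphere_period_s = math.sqrt(3.0 * math.pi / (G_SI * rho))
    return sphere_period_s / 3600.0 * math.sqrt(1.0 + amplitude_mag)
```

Under this approximation a 2 g/cm^3 sphere cannot spin faster than about once per 2.3 hours, and larger amplitudes push the limit to longer periods.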

6.3. Phase-functions and bond albedos 

We consider any of the 54,296 fitted PTF lightcurves to have a reliably-fit phase function if
both of the following conditions are satisfied:

1. The lightcurve is one of the 9,033 having a reliable period fit, or its fitted amplitude
(for the G12 model) is less than 0.1 mag (the latter is true for 1,939 lightcurves, only 39
of which have reliable periods).

2. The lightcurve is fit using data from at least five phase-angle bins of width Δα = 3 deg.
These five bins need not be contiguous, and they need not include phase angles in the region
where opposition surges are typically measured (i.e., α < 10 deg).

Figure 18. Various fitted phase-function parameters plotted against color index and bond
albedo (two-dimensional histograms; the total number of lightcurves in each plot is stated
above it as N = ...). In the right column of plots, one-dimensional distributions with the
color-index classified objects plotted separately. In the right column of 1D histograms, C
and S types are defined as objects with color indices less than 0.25 and greater than 0.75,
respectively.

Figure 19. Left: We perform the same clustering analysis used in defining the color index
(see appendix), this time on the G12 versus A_bond distribution, which contains 1,631 PTF
lightcurves all of which have IR-derived diameters and reliable phase functions. The output of
this clustering analysis is the photometric index, which, analogous to the color index, is a
number between 0 (C type) and 1 (S type) quantifying the class membership of each constituent
asteroid data point. Right: Correlation between the color index and our photometric index, a
comparison which can be made for 361 objects. Note that most data are in the lower left and
upper right corners.



[Figure panel annotations: Spearman corr. = 0.12 (1.1σ); Spearman corr. = -0.54 (>10σ)]

Figure 20. For the 92 asteroids with both R-band and g-band lightcurve fits from the same
opposition, we use the resulting difference in the absolute magnitudes H_g − H_R as a proxy
for taxonomy. This color distribution is qualitatively bimodal (top left), and the correlation
with G12 is very robust (top right). We detect no significant difference in the G12 and/or β
parameters between the two bands, both in the sample as a whole, and as a function of the
H_g − H_R color.

The above two criteria are met by 3,902 out of the 54,296 PTF 
lightcurves. Of these, 1,648 have an infrared-based diameter 
available, 651 have a color index available, and 361 have both 
a diameter and color index. 
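The two selection criteria can be expressed compactly as below. This is an illustrative sketch only: it assumes the 3-deg bins are aligned to start at α = 0, and the function and argument names are hypothetical.

```python
def occupied_phase_bins(phase_angles_deg, bin_width_deg=3.0):
    """Number of distinct phase-angle bins holding at least one observation.

    Bins are assumed to be [0, 3), [3, 6), ... degrees.
    """
    return len({int(alpha // bin_width_deg) for alpha in phase_angles_deg})

def has_reliable_phase_function(phase_angles_deg, period_is_reliable,
                                amplitude_mag, min_bins=5,
                                max_flat_amplitude=0.1):
    """Apply both conditions from the text.

    Condition 1: reliable period fit, or fitted amplitude below 0.1 mag.
    Condition 2: at least min_bins occupied phase-angle bins, which need
    not be contiguous.
    """
    condition_1 = period_is_reliable or amplitude_mag < max_flat_amplitude
    condition_2 = occupied_phase_bins(phase_angles_deg) >= min_bins
    return condition_1 and condition_2
```

Setting `min_bins=4` or `min_bins=3` reproduces the relaxed variants of the second criterion used for smaller subsamples later in the paper.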

Figure 18 details the distributions of the fitted phase parameters G12, G, β and C against the
color index, bond albedo, and in 1D histograms with color-based taxonomic subsets. Though the
phase parameters are all correlated with color index and with bond albedo, none of the 1D
phase-parameter distributions (right column of plots) exhibits bimodality on its own, whereas
the bond albedo (bottom right plot) does show significant bimodality. The red and blue
histograms consist of all asteroids having color metric either less than 0.25 (C types) or
greater than 0.75 (S types). The G and (β, C) phase parameters are only plotted for those
lightcurves which also have a G12 solution. Not every lightcurve produced a solution for all
three of the phase-function models, hence the sample sizes for the G and (β, C) models include
a slightly reduced number of data points.

Figure 21. Added completeness from supplementing the color index with the photometric index
among asteroids having PTF lightcurves. Both indices are a proxy for the taxonomic type. The
left- and right-hand plots apply separately to the subset labeled by the black line above each
column.

We reiterate our statement from Section 3.2.1 that the bond albedo A_bond is a more
fundamental (i.e., intensive rather than extensive) property than is the geometric albedo
p_V, hence our focus on A_bond here. The bond albedo is computed using Equation (8) together
with Equation (15), and makes use of our PTF-derived absolute magnitudes (H from the G12 fit
in particular) as well as the phase integral q of Equation (8), also computed directly from
the G12 fit for φ. In particular,


$$
q(G_{12}) =
\begin{cases}
0.2707 - 0.236\,G_{12} & \text{if } G_{12} < 0.2; \\
0.2344 - 0.054\,G_{12} & \text{otherwise.}
\end{cases}
\qquad (33)
$$
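Equation (33) translates directly into a small function, sketched here with an illustrative name. As a quick consistency check, the two branches agree at G12 = 0.2 to within about 10^-4.

```python
def phase_integral(g12):
    """Phase integral q as a piecewise-linear function of G12 (Equation 33)."""
    if g12 < 0.2:
        return 0.2707 - 0.236 * g12
    return 0.2344 - 0.054 * g12
```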


6.3.1. Taxonomy from lightcurve data 

We use the distribution of bond albedo versus G12 to define another taxonomic metric analogous
to the color index. In particular, we apply the same clustering analysis to this distribution
as we did for the seven 2D color distributions in the appendix. This procedure assigns to
every object in the A_bond-vs.-G12 diagram a probability of membership in each of two clusters
(color coded blue and orange in Figure 19). The cluster centers are fit by the algorithm, and
the output class probability of a given data point relates to its distance from these cluster
centers. Probabilities near 0 represent likely C-type class membership, while probabilities
near 1 represent likely S-type membership. We refer to this new metric as the photometric
index; it complements the color index as another proxy for taxonomy. There are 361 asteroids
with both a photometric index and color index available (Figure 19 right plot); the two
indices are clearly correlated (ρ_Spearman = 0.73, >10σ significance). Note that asteroids
only have a defined photometric index if they have an infrared-derived diameter available, so
that A_bond is defined.

Figure 22. Taxonomic dependence on spin rate and amplitude, also versus diameter, using the
union of the color-index and photometric-index based C/S taxonomy.

6.3.2. Wavelength dependence 

Observational evidence for the reddening of asteroid colors with increasing phase angle is
discussed by Sanchez et al. (2012) and references therein. Color variation with phase angle
can be equivalently stated as variation of the phase function with wavelength. Asteroids which
have PTF lightcurves in both of the survey’s filters (R and g band) allow us to investigate
this phenomenon. We note however that Sanchez et al. (2012) describe phase reddening as being
more pronounced at longer wavelengths (>0.9 μm) and larger phase angles (α > 30 deg), such
that a priori we should not expect a very pronounced effect (if any) in the visible-band PTF
data.

Similar to the complication associated with comparing spin amplitudes from multiple
oppositions (Section 3.1.1), an asteroid’s mean color can potentially change if the spin axis
varies with respect to our line-of-sight from year to year. Hence, we choose not to compare
R-band and g-band phase-function fits from different oppositions. Aside from this constraint,
we adopt the same two reliability selection criteria stated in Section 6.3, with a slight
modification of requirement #2; here we allow four or more phase-angle bins of width
Δα = 3 deg, as opposed to the previous sections’ five-bin requirement, because of the small
sample size.

There are 92 asteroids with both R-band and g-band phase-function fits acquired during the
same opposition that meet the above criteria. For each asteroid we difference the R-band G12
value from the g-band G12 value. The mean of this difference is —0. 004ti];i9 , indicating
(for the whole sample) no significant non-zero difference between the two bands’ G12 values.
Likewise, for β, we compute a difference of 0.002lg gQ 3 , also consistent with zero
difference between the bands.

Since these fits provide absolute magnitudes in each band (i.e., H_g and H_R) we compute the
color H_g − H_R for the 92-asteroid sample. Figure 20 shows that the distribution of this
color is bimodal, suggesting it is a viable proxy for taxonomy. This is further supported by
the strong correlation between H_g − H_R and the R-band G12 value. No correlation is seen
however between H_g − H_R and the difference between the two bands’ G12 or β values.

6.4. Spins and amplitudes vs. taxonomy 

The union of the color-index data (see appendix) and photometric-index data (Section 6.3.1)
provides significantly better taxonomic coverage of the PTF lightcurves (Figure 21). With this
composite taxonomic information in hand, we can repeat the spin-amplitude analyses of
Section 6.2 (Figure 17), this time considering the C-type and S-type groups separately. We
define objects with one or both of the indices less than 0.25 as C type, and those with one
or both greater than 0.75 as S type. We detail the resulting 1,795-object
taxonomically-classified sample in Figure 22. There were 20 asteroids with conflicting
color-based and photometric-based classifications that are not included in this 1,795-object
sample.
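The composite classification rule, including the exclusion of conflicting cases, can be sketched as follows. The function names and the use of `None` for a missing index are illustrative assumptions.

```python
def classify_index(index, low=0.25, high=0.75):
    """Map a single index in [0, 1] to 'C', 'S', or None (unclassified)."""
    if index is None:
        return None
    if index < low:
        return "C"
    if index > high:
        return "S"
    return None

def composite_taxonomy(color_index=None, photometric_index=None):
    """Combine the two indices; conflicting labels yield None."""
    labels = {classify_index(color_index), classify_index(photometric_index)}
    labels.discard(None)
    if len(labels) == 1:
        return labels.pop()
    return None  # unclassified by both indices, or the two disagree
```

An object with only one available index, or with one decisive index and one intermediate index, still receives a label. This is how the union of the two data sets increases coverage.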

The one-dimensional histogram in Figure 22 indicates that S-type asteroids dominate the
smallest objects with data in PTF while C types dominate the largest. This reflects the fact
that the survey’s upper and lower sensitivity limits are defined in terms of absolute
magnitude H (affected by albedo) rather than physical diameter, i.e., S-type asteroids larger
than ~50 km will tend to saturate the PTF detector, while C-type asteroids smaller than ~5 km
will usually fall below the detection limit. Adding to this effect is the fact that S types
mostly occupy the inner main belt, where they are brighter by virtue of smaller heliocentric
and geocentric distances, as compared to the usually more distant C types. While the two
classes have similar representation in the sample (882 S types versus 913 C types), their true
population ratio also affects the relative numbers.

[Figure 23 panel annotations: R = V + (0.00 ± 0.10); S types: R = V + (0.00 ± 0.10);
C types: R = V + (0.02 ± 0.11)]

Figure 23. Left: Transformations between MPC V band and the PTF R and g bands for asteroids,
based on the difference between MPC-fitted and PTF-fitted H magnitudes for asteroids whose
PTF-fitted G values are in the range 0.10 < G < 0.20 as well as other PTF-coverage
constraints (see text). Right: R-band data only, with S and C types defined with either color
and/or photometric indices (again using the <0.25 and >0.75 index thresholds).

Figure 24. Errors in the MPC-listed absolute magnitudes relative to the PTF H values (in R
band and using the G12 fit’s H value), only considering asteroids with IR-derived diameters.
On the right is the corresponding geometric albedo relative error. Pixels in the 2D histograms
shown here are column normalized. The running-bin geometric mean and 16th and 84th percentiles
are shown as green and red lines. Yellow dashed lines are the mean and 84th percentile
expected from the 0.1 mag transformation uncertainty alone (for 1% geometric albedo).


The right-hand side plots in Figure 22 show rotation rate and amplitude versus diameter
separately for the two taxonomic groups. Rather than plot a two-dimensional histogram as was
done in Figure 17, for readability we here just plot the geometric mean and percentiles,
computed by exactly the same running-bin method described in Section 6.2. The most prominent
trend is that among 5 < (D/km) < 20 asteroids, C types appear to rotate slower than S types
and have larger amplitudes than S types. Assuming both asteroid groups share the same mean
angular momentum, the discrepancy could reflect the C types’ ability to more efficiently
redistribute material away from their spin axis, thereby increasing their moment of inertia
(amplitude) while decreasing their angular rotation rate (i.e., a simple manifestation of
conservation of angular momentum).
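The argument can be written out explicitly under a simplifying assumption not stated in the text, namely that each body rotates as a rigid body about its principal axis with angular momentum L, moment of inertia I, and angular rate ω:

```latex
L = I\,\omega
\quad\Longrightarrow\quad
\frac{\omega_{\rm C}}{\omega_{\rm S}} = \frac{I_{\rm S}}{I_{\rm C}}
\quad \text{at fixed } L .
```

If C types of comparable mass are more elongated, so that I_C > I_S, then ω_C < ω_S, which matches the sense of the observed trend.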

The above-stated assumption of a common mean angular momentum between C and S types is merely
a simple case and is neither unique nor rigorously motivated. More careful consideration of,
e.g., plausible ranges of internal tensile strengths of the two types could easily lead to
more diverse scenarios wherein the two groups actually have different angular momenta and the
observed spin-amplitude trends. As noted earlier (Section 6.2), large asteroids in general
appear to have retained a significant primordial component in their spin distribution
(Steinberg & Sari 2015); it is therefore important that differences in the origin of C types
and S types (accretionary, temporal and/or spatial) be taken into account along with
differences in collisional evolution and differing contributions from radiative forces like
YORP. Simulations of the main belt’s origin, such as the Grand Tack family of models (Walsh et
al. 2011), should ultimately be modified to track particle spin evolution as well as orbits.

We also reproduce the period-vs.-amplitude plot first shown in Figure 17, this time plotting
separately the two taxonomic groups. The S types show a clearer cutoff at the 2 g/cm^3 contour
line, suggesting they may in general be of greater bulk density than the C types, which show
a softer boundary in this period-vs.-amplitude space, the precise location of which appears to
be somewhere between 1 and 2 g/cm^3. Note that comparison to these density contours is only
valid if the asteroids in consideration are held together mostly by self-gravity and
approximated as fluids (as opposed to having significant internal cohesive or frictional
resistance). These results are in general agreement with existing asteroid density estimates
(Carry 2012 and references therein). Results from an independent analysis of a smaller, more
densely-sampled set of PTF asteroid lightcurves (Chang et al. in review; a study that follows
closely the approach of Chang et al. 2014a) agree with the C type vs. S type rotation rate
discrepancy discussed here.

7. COMPARISON TO MPC-GENERATED MAGNITUDES 

Absolute magnitudes available through the Minor Planet Center (MPC) and JPL Solar System
Dynamics (http://ssd.jpl.nasa.gov) websites are fit using all available
survey/observer-contributed photometry. These H magnitudes are used in various online
ephemeris tools to compute predicted V magnitudes to accompany astrometric predictions. Their
model assumes no rotational modulation, uses the Lumme-Bowell G-model (Section 3.2.2), and,
with the exception of ~100 large objects (nearly all with D > 30 km), assumes a constant
G = 0.15 for all asteroids. Our results (Figure 18, second row of plots) show that the
G = 0.15 approximation does indeed agree well with the peak of the distribution of fitted G
values. The PTF-fitted G values do, however, show some spread and variation with taxonomy. In
this section we explore the resulting differences in the absolute magnitudes H and in
predicted magnitudes.

7.1. Filter transformations 

In order to compare the MPC-listed (H_MPC) magnitudes, which are in V band, with PTF’s
absolute magnitudes (H_PTF, corresponding to the G-model fit), which are in either R or g
band, we must first compute an approximate transformation from V band to each PTF band. While
some transformations are given by Ofek et al. (2012a), we here prefer to empirically estimate
these using actual asteroid photometry from both PTF and the MPC, rather than generating them
from the more general transformations of Ofek et al. (2012a).

Figure 23 plots H_PTF − H_MPC for asteroids whose PTF-derived G_PTF is in the range
0.1 < G_PTF < 0.2. By restricting the comparison to objects with fitted G_PTF values close to
0.15, we in principle select H_MPC magnitudes for which the MPC’s G_MPC = 0.15 assumption is
actually valid (none of the asteroids in Figure 23 have MPC-listed G values other than the
default 0.15). Furthermore, we only consider (in Figure 23) asteroids with PTF data in at
least three phase-angle bins of Δα = 3 deg and either a reliable period or fitted amplitude
less than 0.1 mag.

(JPL Solar System Dynamics: http://ssd.jpl.nasa.gov)

Figure 25. Comparison of the root-mean-square residuals, with respect to the PTF (H, G12) plus
rotation fit and the MPC (H, G) fit, for all lightcurves having a reliable R-band PTF
phase-function fit. [Horizontal axis: lightcurve RMS (mag).]

Comparing the H_MPC and H_PTF magnitudes for this specific subset of asteroids, we obtain
approximate transformations R = V + (0.00 ± 0.10) and g = V + (0.55 ± 0.16). The 1σ
uncertainties of 0.10 and 0.16 mag plausibly include a combination of the photometric
calibration uncertainties of the MPC data (coming from a variety of surveys/observers),
variation in H magnitude of a given asteroid between different oppositions (the MPC fits
combine data possibly acquired at different viewing geometries), as well as the range of
G_PTF used in selecting the asteroids in this sample. Consideration of a range of G_PTF values
is equivalent to considering a range of asteroid colors (cf. the color-vs.-G correlation seen
in Figure 18). Hence the uncertainties in these transformations also encompass the variation
which might otherwise be formally fit in a color term for the transformations. Such a color
term for R to V would almost certainly be less significant than that of g to V, as the former
transformation is already zero within uncertainties. The larger uncertainty in the g to V
transformation is likely attributable to both the smaller sample size and the fact that the V
bandcenter is further displaced from g than from R, such that color variation has a more
pronounced effect.
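A minimal sketch of this empirical estimate is given below, assuming the transformation offset and its 1σ uncertainty are taken as the mean and sample standard deviation of H_PTF − H_MPC over the selected objects. The text does not say whether a mean, median, or fitted Gaussian was used, so that choice and the function name are assumptions.

```python
import statistics

def estimate_transformation(h_ptf, h_mpc, g_ptf, g_min=0.10, g_max=0.20):
    """Offset and scatter of H_PTF - H_MPC for objects with g_min < G < g_max.

    Returns (mean offset, sample standard deviation) in magnitudes.
    """
    diffs = [
        hp - hm
        for hp, hm, g in zip(h_ptf, h_mpc, g_ptf)
        if g_min < g < g_max
    ]
    if len(diffs) < 2:
        raise ValueError("need at least two objects in the selected G range")
    return statistics.mean(diffs), statistics.stdev(diffs)
```

The additional cuts described in the text (phase-angle coverage, reliable period or low amplitude) would be applied to the input lists before calling this function.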

Given the above-computed transformations, and the fact that 89% of our fitted lightcurves are
in R band, we proceed using only R-band lightcurve fits, which we compare directly against MPC
magnitudes (or formally, after applying the transformation of zero). A detail of the color
dependence of the R to V transformation appears in the right plot of Figure 23; the mean
transformation differs slightly between S and C types but not at a level comparable to the
uncertainty in either.

7.2. Absolute magnitudes

In Figure 24 we show the relative error in the MPC absolute magnitudes as compared to the PTF
magnitudes, for all 1,630 lightcurves with sufficient phase angle coverage in PTF (with the
five-bin phase-angle criterion). These errors should reflect not only any discrepancy due to
the different phase function models (PTF’s G12 versus MPC’s G), but also variation in absolute
photometric calibrations (within the MPC data internally and/or between the MPC and PTF data
sets). The 0.1-mag uncertainty in the R to V band transformation has a prominent contribution
to the errors shown here (the mean and 84th percentile of the errors expected from the 0.1-mag
transformation uncertainty alone are shown as yellow dashed lines, and assume p_V = 0.07). The
green line (computed mean) and upper red line (84th percentile) indicate the errors are ~1%
greater than those expected from the transformation uncertainty alone, though this increases
slightly for the largest (D > 30 km) objects. Note that many of these largest asteroids are
more frequently observed by programs other than the major sky surveys; these smaller
facilities tend to use smaller aperture telescopes and different absolute calibration
standards, which would contribute to the error.
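To see how an error in H propagates to the geometric albedo, recall the standard relation p_V = (1329 km / D)^2 × 10^(−0.4 H). At fixed diameter the albedo ratio depends only on the magnitude difference. The sketch below uses that relation; the function name is illustrative, and the way the plotted errors in Figure 24 were actually computed is not specified here.

```python
def albedo_relative_error(h_mpc, h_ptf):
    """Fractional geometric-albedo error implied by H_MPC - H_PTF.

    Uses p_V proportional to 10**(-0.4 * H) at fixed diameter, so the
    result is p_V(MPC) / p_V(PTF) - 1.
    """
    return 10.0 ** (-0.4 * (h_mpc - h_ptf)) - 1.0
```

A 0.1-mag offset in H therefore corresponds to a roughly 9-10% relative change in the derived albedo, which gives a sense of scale for the transformation uncertainty discussed above.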

7.3. Predicted apparent magnitudes

Instead of comparing just the fitted H magnitudes, for every lightcurve with a reliable
PTF-fitted phase function we also compare the root-mean-square residual of all PTF data in
that lightcurve with respect to both our G12-fit-predicted R magnitude and the MPC (G = 0.15)
predicted V magnitude. Our fit includes more fitted parameters and obviously should result in
smaller residuals; Figure 25 shows that we see a factor ~3 smaller residuals in particular
using the PTF fit. Note that if the 0.1-mag R-to-V transformation uncertainty were the only
significant contributor to the MPC residuals then their peak would instead be at ~0.07 mag
rather than ~0.25 mag. Ignored rotational modulation and inaccurate phase functions move the
MPC residuals distribution to higher RMS values.
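The RMS residual being compared is simply the root of the mean squared difference between observed and model-predicted magnitudes. The sketch below shows it, with a function name chosen for illustration.

```python
import math

def rms_residual(observed_mags, predicted_mags):
    """Root-mean-square of (observed - predicted), in magnitudes."""
    residuals = [o - p for o, p in zip(observed_mags, predicted_mags)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

As a simple illustration of why neglecting rotation inflates the scatter: a purely sinusoidal rotation term with a semi-amplitude of 0.3 mag, sampled evenly over a full period, contributes 0.3/√2 ≈ 0.21 mag of RMS on its own when the model assumes no rotational modulation.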

The lower RMS residuals afforded by the PTF lightcurve model permit a more sensitive search
for low-level transient activity (e.g., collisional events, cometary brightening) in these
asteroids. For example, Cikota et al. (2014) perform a search for active main-belt asteroids
using photometric residuals of all MPC data taken with respect to the MPC-predicted apparent
V magnitudes. We currently are pursuing a similar analysis using these PTF lightcurves, as a
follow-up to the morphology-based search already completed with PTF (Waszczak et al. 2013). A
hybrid approach, wherein morphological measurements are made on stacked images of asteroids
which have reliable lightcurve fits, could further reveal this kind of subtle activity.

8. SUMMARY 

From five years of PTF survey data we have extracted over 4 million serendipitous detections
of asteroids with known orbits. We fit a photometric model to ~54,000 lightcurves, each
consisting of at least 20 observations acquired within a given opposition in a single filter.
We adopt a second-order (four-term) Fourier series for the rotation component and fit three
distinct phase-function models. We assess the reliability of our retrieved rotation periods
by subjecting them to both an automated classifier and manual review. Both vetting processes
are trained on a sample of ~800 asteroids with previously measured spin periods that also
occur in our sample. We consider the intersection of the two screened samples for subsequent
analysis.

Preliminary analysis (on distributions that are not de-biased) of the rotation period versus
diameter confirms the previous finding that asteroids smaller than ~40 km do not conform to a
Maxwellian distribution in their normalized spin frequencies. Phase-function parameters are
shown to correlate strongly with the bond albedo. None of the phase-function parameters
display bimodality in their measured distributions, however. Together with the bond albedo, we
use the phase function data to define a new taxonomic metric based solely on single-band
lightcurve properties together with infrared-derived diameters (G12 and A_bond). This metric
complements the color-based index established previously by many visible-color and
spectroscopic surveys. Combining these color- and photometry-based taxonomic indices allows us
to separately examine the spin and amplitude distributions of the C-type and S-type asteroids
with the largest possible sample sizes. Doing so reveals that, among small objects
(5 km < D < 20 km), the C types show larger amplitudes and slower spin rates. If the two
populations shared a common angular momentum distribution, this could be interpreted as the
two compositional types’ differing tendencies to redistribute mass away from their spin axes.
Comparison of the spin-amplitude distribution with contours of maximal spin rates for
cohesionless bodies suggests that almost all asteroids are less dense than ~2 g/cm^3, with C
types displaying a potentially less dense upper limit of between 1 and 2 g/cm^3.

Finally, our fitted absolute magnitudes differ from those generated by the Minor Planet
Center’s automated fitting procedures, though the precise discrepancy is difficult to
ascertain given uncertainty in the transformation between PTF R-band and the MPC’s V-band. The
utility in using our model to predict asteroid apparent magnitudes is seen in the three-fold
reduction in RMS scatter about our model relative to the fiducial G = 0.15 model that neglects
rotation. This reduced scatter is an essential prerequisite for sensitive searches for
cometary, collisional, and other transient activity in what would otherwise be regarded as
quiescent asteroids, potentially even bright objects.

9. APPENDIX

9.1. Multi-survey visible-band color index 

The purpose of this appendix section is to introduce a one-dimensional color metric, based
upon data from seven different colorimetric asteroid surveys, which quantifies an asteroid’s
first-order visible-band color-based taxonomy as a number between 0 (C-type endmember) and 1
(S-type endmember). Our primary motivation for doing this is to enable a uniform comparison of
PTF-lightcurve-derived parameters with color spanning from the brightest/largest objects
(H ≈ 8-9 mag, or D ≈ 125-80 km diameters) down to PTF’s detection limit for main-belt
asteroids (H ≈ 16 mag, or D ≈ 2-4 km). Figure 26 panel A shows that the fraction of PTF
lightcurves with color information increases by a factor of ~3 among large asteroids when all
seven surveys are considered, whereas for smaller objects the Sloan Digital Sky Survey’s
(SDSS; York et al. 2000; Ivezić et al. 2002; Parker et al. 2008) moving-object catalog
provides essentially all of the color information.

The seven surveys we use are described in Table 3. All of these surveys contain at least two
independent color measurements, and when plotting their data in these two-dimensional spaces
(or 2D subspaces defined by properly-chosen principal components or spectral slope
parameters), the first-order C-type and S-type clusters are in all cases prominently seen
(Figure 26 panel B). To each such 2D color distribution we apply a two-dimensional fuzzy
c-means (FCM) clustering algorithm (Bezdek 1981; Chiu 1994). For each survey data set, FCM
iteratively solves for a specified number of cluster centers (in our case, two) in N
dimensions (in our case one dimension) by minimizing an objective function which adaptively
weights each datum according to the robustness of its membership in a given cluster. The FCM
output includes computed cluster centers and, for each datum, the probability that the datum
belongs to each cluster (this being related to the datum’s distance from each cluster center).
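The standard fuzzy c-means iteration can be sketched in pure Python as follows. This is a generic textbook form rather than the specific implementation used for this work. The fuzzifier m = 2, the caller-supplied initial centers, and the stopping tolerance are all assumptions.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, n_iter=100, tol=1e-9):
    """Alternate membership and center updates until the centers settle.

    points  : list of equal-length tuples (e.g., two colors per asteroid)
    centers : initial guesses for the cluster centers (list of tuples)
    Returns (centers, memberships); each memberships[i] sums to 1 and is
    taken from the final iteration's membership update.
    """
    centers = [tuple(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: closer centers receive larger weights.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # Point sits exactly on a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                memberships.append([h / sum(hits) for h in hits])
            else:
                inv = [d ** -exponent for d in dists]
                total = sum(inv)
                memberships.append([v / total for v in inv])
        # Center update: mean of the points weighted by membership**m.
        new_centers = []
        for j in range(len(centers)):
            weights = [u[j] ** m for u in memberships]
            w_sum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / w_sum
                for k in range(len(points[0]))
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

With two clusters, the membership in the second cluster is a single number between 0 and 1 per object, which is the kind of quantity the color index (and later the photometric index) represents.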

In the color-distribution plots of Figure 26 panel B (the 
plots with black backgrounds arranged diagonally), each 
pixel/bin is colorized according to the average cluster- 
membership probability of asteroids in that pixel. Blue in¬ 
dicates high probability of membership in cluster 1 while or¬ 
ange represents high probability of membership in cluster 2. 

Our color index provides a more quantitative label than that offered by popular letter-based
taxonomic systems (e.g., Bus et al. 2002 and refs. therein). Several such letter-based
nomenclatures were in fact defined on the basis of one or more of these seven surveys,
oftentimes using a method similar to the clustering technique we use here. We identify our
blue cluster with C-type asteroids and our orange cluster with S-type asteroids, though we
make this association purely for connection/compatibility with the literature. This is because
our computed clusters have their own unique identity/definition, formally distinct from that
given in any other work. Our clusters’ definitions are nonetheless completely
specified/reproducible by the FCM algorithm we used to compute them.

In reducing the taxonomic classification to a single number
defined by the two most prominent groups (C and S types), we
lose the ability to distinguish secondary classes like V types,
D types, and so on. If such a sub-group is separated from both
of the two main clusters, its members will be assigned
membership scores close to 0.5. For example, in the SDSS a*
vs. i − z complex, the clearly-seen V-type ‘tail’ protruding
Table 3

Asteroid colorimetry data sets used in computing this work’s C/S color metric. These data sets are visualized in Figure 26.

| Survey name | References | Data description | # asteroids |
|---|---|---|---|
| UBV colors | Bowell et al. (1978); Tedesco (1995) | U, B, and V broadband photometry acquired mostly at Lowell Observatory in the 1970s with photomultiplier tubes. | 902 |
| Eight-Color Asteroid Survey (ECAS) | Zellner et al. (1985); Zellner et al. (2009) | Photometry in eight custom filters measured with photomultipliers at Catalina and Steward Observatories. We compute and use the principal component color index PC#1 = 0.771(b − v) − 0.637(v − w). Excludes objects with PC#1 error >0.3 mag. | 480 |
| 24-Color Asteroid Survey | Chapman & Gaffey (1979); Chapman et al. (1993) | Photometry in 24 interference filters measured with photomultipliers at Mauna Kea. We compute and use the mean spectral reflectance slope and first principal component. | 262 |
| Small Main-belt Asteroid Spectroscopic Survey (SMASS) | Xu et al. (1995); Xu et al. (1996) | CCD spectroscopy (0.4-1.0 μm, R ≈ 100) conducted mostly at Kitt Peak. We compute and use the mean spectral reflectance slope and first principal component. | 305 |
| Small Main-belt Asteroid Spectroscopic Survey II (SMASS-2) | Bus & Binzel (2002); Bus & Binzel (2003) | CCD spectroscopy (0.4-1.0 μm, R ≈ 100) conducted at Kitt Peak. We compute and use the mean spectral reflectance slope and first principal component. | 1,313 |
| Small Solar System Objects Spectroscopic Survey (S3OS2) | Lazzaro et al. (2004) | CCD spectroscopy (0.5-0.9 μm, R ≈ 500) conducted at ESO (La Silla). We compute and use the mean spectral reflectance slope and first principal component. | 730 |
| Sloan Digital Sky Survey (SDSS) griz colors | Ivezic et al. (2002); Parker et al. (2008); Ivezic et al. (2010) | g, r, i, and z broadband CCD photometry acquired by SDSS from 1998-2009. Includes data in the Moving Object Catalog v4, supplemented with post-2007 detections from SDSS DR10. We use the first principal component a* defined in the references. Excludes objects with a* error >0.05 mag or (i − z) error >0.1 mag. | 30,518 |


down from the S-type cluster appears mostly green in color,
reflecting its intermediate classification. Likewise for the
less-clearly seen D types, which in the SDSS plot lie above the S
types and to the right of the C types (again in a green-colored
region). The orders of magnitude lower numbers of such
secondary types make them mostly irrelevant for the purpose of
this analysis.

We compute the numerical uncertainty (variance) of a given
asteroid’s cluster-membership score in a particular survey by
performing many bootstrapped trials wherein we first
perturb all data points by random numbers drawn from
Gaussian distributions whose widths are the quoted 1σ
measurement (i.e., photometric) uncertainties in each of the two
dimensions, and then repeat the FCM analysis on the
perturbed data. The variance in each object’s reported cluster
probability is then computed after a large number of bootstrap
trials.
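The perturb-and-refit procedure above can be sketched as below. This is an illustrative outline under stated assumptions: `score_variance` and `two_center_score` are hypothetical names, the number of trials is arbitrary, and for brevity the demo scores each perturbed data set with a fixed-center fuzzy membership (m = 2) instead of re-running a full FCM fit as the text describes.

```python
import numpy as np

def score_variance(data, sigma, score_fn, n_trials=500, seed=0):
    """Variance of each object's score under Gaussian measurement noise.

    data, sigma: arrays of shape (n_objects, 2) with the measured values
    and their 1-sigma uncertainties in each dimension. score_fn maps such
    an array to one score per object (in the text, a full FCM re-run).
    """
    rng = np.random.default_rng(seed)
    trials = np.empty((n_trials, data.shape[0]))
    for k in range(n_trials):
        # Independent Gaussian offset for every point and every axis.
        perturbed = data + rng.normal(0.0, sigma)
        trials[k] = score_fn(perturbed)
    return trials.var(axis=0, ddof=1)

def two_center_score(points, c1=(-0.1, 0.0), c2=(0.1, 0.05)):
    """Stand-in for FCM: membership in cluster 2 for two fixed centers."""
    d1 = np.sum((points - np.asarray(c1)) ** 2, axis=1)
    d2 = np.sum((points - np.asarray(c2)) ** 2, axis=1)
    return d1 / (d1 + d2)

# Three synthetic objects: at center 1, midway, and at center 2.
data = np.array([[-0.1, 0.0], [0.0, 0.025], [0.1, 0.05]])
sigma = np.full_like(data, 0.02)
variances = score_variance(data, sigma, two_center_score)
```

In this toy setup the object midway between the two centers has by far the largest score variance, since small color shifts there change its membership the most.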

Some asteroids appear in only one of the seven surveys; for
such objects the color index is simply the cluster-membership
score in that particular survey. For asteroids appearing in
multiple surveys, we take the variance-weighted average of
the multiple membership scores (and compute that composite
score’s variance by summing the component variances in
inverse quadrature, as usual).
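As a worked illustration of the variance-weighted average, the sketch below combines two scores for one asteroid. The function name `combine_scores` and the numerical values are hypothetical; the weights are the inverse variances and the composite variance is the reciprocal of the summed weights.

```python
import numpy as np

def combine_scores(scores, variances):
    """Inverse-variance-weighted mean and its variance."""
    scores = np.asarray(scores, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    mean = np.sum(weights * scores) / np.sum(weights)
    # Component variances combine as 1/var = sum(1/var_i).
    var = 1.0 / np.sum(weights)
    return mean, var

# One asteroid with membership scores from two surveys (made-up values).
combined, combined_var = combine_scores([0.80, 0.90], [0.01, 0.04])
```

Here the more precise survey (variance 0.01) dominates, giving a composite score of 0.82 with variance 0.008.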

The many off-diagonal plots in Figure 26 panel B compare
the cluster-membership scores of all asteroids appearing in
all possible survey intersections. The number of asteroids in
each survey (and in the intersection of each survey pair)
appears above each plot (N = ...). The survey-pair
distributions are 2D-histograms where higher densities of data points
correspond to black pixels/bins and low density or lack of data
points is white. Evidently all possible survey combinations
contain at least some asteroids (several share hundreds), and
in all cases the individual taxonomic indices (on the
horizontal and vertical axes) correlate strongly, confirming the
consistency of the cluster membership between surveys.

In Figure 27 we illustrate some useful applications of this
color index by comparing it with various asteroid surface
observations. One of these quantities (SDSS a* color) was used
in computing the color index, so its correlation with the
clustering index is expected, and is indeed confirmed.

In the leftmost plot of Figure 27, asteroid photometry from
GALEX (NUV band), compiled by Waszczak et al. (in
prep), is normalized by the nominal G = 0.15 phase-model
(Section 3.2.2) predicted brightness at the time of the GALEX
observations, and the resulting NUV − V color evidently
correlates with the visible color index. This indicates that
asteroid reflectance slopes in the visible persist into the UV.

Figure 27 also plots our color index against the W1-band
geometric albedo derived from WISE observations obtained
during its fully cryogenic mission. We only include asteroids
which were detected in both of the thermal bands (W3
and W4) and which therefore have a reliable diameter
estimate. Use of this diameter in Equation (15) then permits
estimation of the albedo, where the W1-band albedo uses the
corresponding WISE photometry (H in Equation [15] being
replaced with the appropriate W1-band absolute magnitude).
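The albedo estimate can be illustrated with a short calculation. Equation (15) is not reproduced in this section, so the sketch assumes it has the standard form D = (D0 / sqrt(p)) × 10^(−H/5), with D0 = 1329 km the usual V-band constant. The function name `geometric_albedo`, the parameter `d0_km`, and the input values are hypothetical, and a band other than V would in general require a different constant.

```python
def geometric_albedo(diameter_km, abs_mag, d0_km=1329.0):
    """Invert D = d0 / sqrt(p) * 10**(-H/5) for the geometric albedo p.

    d0_km = 1329 km is the commonly used V-band constant (an assumption
    here; it depends on the solar magnitude in the band being used).
    """
    return (d0_km / diameter_km) ** 2 * 10.0 ** (-0.4 * abs_mag)

# Made-up example: a 10 km asteroid with absolute magnitude 12.5.
albedo = geometric_albedo(diameter_km=10.0, abs_mag=12.5)
```

For these inputs the albedo comes out near 0.18; halving the diameter at fixed absolute magnitude would quadruple it.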

The rightmost plot in Figure 27 shows our color index’s
relationship to a near-infrared color from the ground-based
2MASS survey (Skrutskie et al. 2006). Serendipitous asteroid
detections were extracted from 2MASS by Sykes (2000,
2010) and include fluxes in J band (1.25 μm), H band (1.65
μm; not to be confused with the absolute visible magnitude
H used elsewhere in this work), and K band (2.17 μm).

Figure 28 plots our color index against proper orbital
elements retrieved from the Asteroids Dynamic Site (AstDyS;
Knezevic & Milani 2012), revealing the distinct colors of
dynamical families and the overall transition from S to C types
with increasing semi-major axis. These are similar to the plots
The Galaxy Evolution Explorer (GALEX) is a NASA Small
Explorer-class space telescope which from 2003-2012 conducted an imaging survey
in a far-UV band (FUV, 130-190 nm) and a near-UV band (NUV, 180-280
nm). Martin et al. (2005) discuss the extragalactic science program; Morrissey
et al. (2005, 2007) discuss the on-orbit performance, survey calibration and
data products. The Waszczak et al. (in prep) NUV data shown here are derived
from data available at http://galex.stsci.edu.

The Wide-field Infrared Survey Explorer (WISE) is a NASA Medium
Explorer-class space telescope which in 2010 conducted a cryogenic IR
imaging survey in four bands: W1, W2, W3, and W4, centered at 3.4, 4.6, 12,
and 22 μm, respectively. Wright et al. (2010) detail mission/performance;
Masiero et al. (2011) and refs. therein present preliminary asteroid data.
Figure 26. Panel A: Fraction of PTF lightcurves with colorimetric data available, for both the reliable-period and reliable-period-plus-G12 sets of lightcurves.
Panel B: Two-dimensional color distributions for seven surveys, and correlations of FCM-clustering-derived classifications between all pairs of surveys.


of Parker et al. (2008), which is not surprising given that the
majority of the asteroids’ color indices are based on SDSS
data alone. Of the 32,5023 asteroids with a defined color
index, there are 30,508 with proper orbital elements which are
represented in Figure 28.

9.2. Compilation of IR-derived diameters 

Similar to how we combined several surveys’ colorimetric
data in the previous section, here we compile
thermal-infrared-derived diameter estimates from four surveys. Our
aim is again to provide the largest possible sample for
comparison with PTF-derived lightcurve data. Just as SDSS is
the main contributor of colorimetry overall but suffers from
incompleteness for large/bright asteroids, analogously WISE
provides the vast majority of IR-based diameter
measurements but levels off at ~80% completeness at the bright end
(Figure 29). We thus supplement WISE with diameter data
from the Infrared Astronomical Satellite (IRAS; Matson et al.
Figure 27. Relationship between various asteroid surface measurements (from the UV to near-IR) and this work’s visible-color-derived C/S color index. See
text for descriptions of the data sets used here, and accompanying references.
Figure 28. Relationship between proper orbital elements and this work’s
visible-color-derived C/S color index for 30,508 asteroids.


1986, Tedesco et al. 2002), the Mid-Course Space Experiment
(MSX; Tedesco et al. 2002), and AKARI (Usui et al. 2011).
Usui et al. (2014) compare several of these different data sets
in terms of coverage and accuracy. As we did when defining
Figure 29. Fraction of PTF lightcurves with thermal-IR-based diameter
estimates available, for both the reliable-period and reliable-G12 sets of
lightcurves.


the color index, asteroids occurring in multiple IR surveys are 
assigned the variance-weighted average diameter. 

Regarding the WISE data in particular, we again use only
those diameters which resulted from a thermal fit constrained
by fluxes in all four WISE bands during the cryogenic
mission. Furthermore, we use the latest (revised) diameter
estimates published by Masiero et al. (2014), which adopted an
improved thermal modeling technique first discussed by Grav
et al. (2012).


9.3. Lightcurve data tables 

The online version of this article includes two electronic
tables containing the derived lightcurve parameters and the
individual photometric observations in each lightcurve. Tables
4 and 5 describe the columns and formatting of these tables,
which include data on all reliable-period lightcurves as well
as those having amplitudes less than 0.1 mag and sampling
in five or more 3-deg-wide phase-angle bins (which have
reliable G12 fits). Using these tables one can produce plots of the
PTF lightcurves we have analyzed in this work.
Table 4

Parameters describing PTF lightcurves with a reliable period or phase function. Byte-by-byte Description of file: ptf_asteroid_lc_parameters.txt

| Bytes | Format | Units | Label | Explanations |
|---|---|---|---|---|
| 1-4 | I4 | — | | Lightcurve ID number^a |
| 6-11 | I6 | — | | Asteroid number (IAU designation) |
| 13-14 | I2 | yr | | Last two digits of opposition year |
| 16 | I1 | — | | Photometric band: 1 = Gunn-g, 2 = Mould-R |
| 18-20 | I3 | — | | Number of observations in the lightcurve |
| 22-26 | F5.2 | mag | | Median apparent magnitude |
| 28-37 | F10.5 | day | τ_min | Time (MJD) of first observation |
| 39-48 | F10.5 | day | τ_max | Time (MJD) of final observation |
| 50-54 | F5.2 | deg | α_min | Minimum-observed phase angle |
| 56-60 | F5.2 | deg | α_max | Maximum-observed phase angle |
| 62-63 | I2 | — | | Number of sampled phase-angle bins of 3-deg width |
| 65-68 | F4.2 | — | p | Reliability score from machine classifier: 0=bad, 1=good |
| 70 | I1 | — | | Manually-assigned reliability flag: 0=bad, 1=good |
| 72 | I1 | — | | Period reliability flag: 0=bad, 1=good (product of two previous columns) |
| 74-79 | F6.3 | mag | H | Absolute magnitude from G12 fit |
| 81-85 | F5.3 | mag | | Uncertainty in absolute magnitude from G12 fit |
| 87-91 | F5.3 | — | G12 | Phase-function parameter G12 |
| 92-98 | F6.3 | — | | Uncertainty in G12^b |
| 100-105 | F6.3 | — | G | Phase-function parameter G |
| 107-113 | F7.4 | mag/deg | β | Phase-function parameter β |
| 115-119 | F6.3 | mag | C | Phase-function parameter C |
| 121-124 | F4.2 | mag | | Amplitude from G12 fit (max − min) |
| 126-134 | F9.4 | hr | P | Period from G12 fit |
| 136-144 | F9.4 | hr | | Period uncertainty from G12 fit |
| 146-152 | F7.4 | mag | A11 | Fourier coefficient A1,1 from G12 fit |
| 154-160 | F7.4 | mag | A12 | Fourier coefficient A1,2 from G12 fit |
| 162-168 | F7.4 | mag | A21 | Fourier coefficient A2,1 from G12 fit |
| 170-176 | F7.4 | mag | A22 | Fourier coefficient A2,2 from G12 fit |
| 178-181 | F4.2 | — | | Ratio of the two peak heights in folded rotation curve^c |
| 183-186 | F4.2 | — | χ²_red | Reduced chi-squared of the fit |
| 188-192 | F5.3 | mag | | “Cosmic error” (see Section 4.1) |
| 194-198 | F5.3 | mag | | Root-mean-square residual of observations w.r.t. the fit |
| 200-206 | F7.3 | hr | | Reference period (from http://sbn.psi.edu/pds/resource/lc) |
| 208-213 | F6.2 | km | D | Diameter derived from thermal IR data^d |
| 215-218 | F4.2 | km | | Uncertainty in diameter |
| 220-224 | F5.3 | — | A_bond | Bond albedo^e |
| 226-231 | F6.4 | — | | Uncertainty in bond albedo |
| 233-236 | F4.2 | — | | Color-based taxonomic index: 0=C-type, 1=S-type |
| 238-241 | F4.2 | — | | Photometry-based taxonomic index: 0=C-type, 1=S-type |

^a ID number labels individual observations in Table 5.
^b Set to −1 if larger than the interval tested in grid search.
^c Set to 0 if there is only one maximum in the folded lightcurve.
^d References for the IR diameters are given in the text (appendix).
^e Bond albedo only computed for objects with reliable G12 and available diameter.
Table 5

Parameters describing PTF lightcurves with a reliable period or phase function. Byte-by-byte Description of file: ptf-asteroid-lc-observations.txt

| Bytes | Format | Units | Label | Explanations |
|---|---|---|---|---|
| 1-4 | I4 | — | | Lightcurve ID number^a |
| 6-15 | F10.5 | day | τ | Light-time-corrected observation epoch |
| 17-26 | F10.7 | AU | r | Heliocentric distance |
| 28-37 | F10.7 | AU | Δ | Geocentric distance |
| 39-43 | F5.2 | deg | α | Solar phase angle |
| 45-50 | F6.3 | mag | R or g | Apparent magnitude^b |
| 52-56 | F5.3 | mag | | Uncertainty in apparent magnitude |
| 58-62 | F5.3 | mag | | Uncertainty in apparent magnitude with cosmic-error |
| 64-69 | F6.3 | mag | | Magnitude corrected for distance and G12 phase function |
| 71-76 | F6.3 | mag | | Magnitude corrected for distance and rotation (G12 fit) |
| 78-83 | F6.3 | mag | | Residual with respect to the G12 fit |
| 85-89 | F4.1 | — | | Rotational phase from 0 to 1 (G12 fit) |

^a ID number also corresponds to the line number in Table 4.
^b Filter/band is specified in Table 4.
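A reader wishing to load the observation records could slice each line by the byte ranges listed in Table 5. The sketch below is one possible approach, not a tool distributed with this work: the field names, the helper `parse_observation`, and the sample line (synthetic values chosen only to fit the field widths) are all assumptions.

```python
# (first_byte, last_byte, name, type); bytes are 1-indexed and inclusive,
# following the byte ranges of Table 5. Field names are illustrative only.
OBS_COLUMNS = [
    (1, 4, "lc_id", int),
    (6, 15, "epoch_day", float),
    (17, 26, "r_au", float),
    (28, 37, "delta_au", float),
    (39, 43, "phase_angle_deg", float),
    (45, 50, "mag", float),
    (52, 56, "mag_err", float),
    (58, 62, "mag_err_cosmic", float),
    (64, 69, "mag_dist_phase_corr", float),
    (71, 76, "mag_dist_rot_corr", float),
    (78, 83, "residual", float),
    (85, 89, "rot_phase", float),
]

def parse_observation(line):
    """Convert one fixed-width line into a dict; blank fields become None."""
    record = {}
    for first, last, name, cast in OBS_COLUMNS:
        field = line[first - 1:last].strip()
        record[name] = cast(field) if field else None
    return record

# Synthetic 89-character record built to match the column layout.
sample_line = (
    f"{12:4d} {6012.34567:10.5f} {2.5:10.7f} {1.6:10.7f} {12.34:5.2f} "
    f"{18.123:6.3f} {0.045:5.3f} {0.050:5.3f} {13.456:6.3f} "
    f"{13.400:6.3f} {-0.012:6.3f} {0.25:5.2f}"
)
record = parse_observation(sample_line)
```

Grouping parsed records by `lc_id` and plotting the rotation-corrected or phase-corrected magnitudes against phase angle or rotational phase would then reproduce the kind of lightcurve plots discussed in the text.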